Volume 30 Number 1 March 1979


BJPIAS 30 (No. 1) 1-105 (1979) ISSN 0007-0882




The British Journal for the Philosophy of Science


Published for the
British Society for the Philosophy of Science
by Aberdeen University Press







THE BRITISH JOURNAL FOR THE 
PHILOSOPHY OF SCIENCE 


Editors: JOHN WATKINS and JOHN WORRALL 


Subscriptions

The Journal is published quarterly. The subscription price for mailing addresses 
in Britain is £7.00 net (U.S.A. and Canada U.S. $22.00, other overseas £11.00) 
per volume post free; single issues £2.50 net (U.S.A. and Canada U.S. $8.00, other
overseas £4.00) plus postage. For members of the British Society for the Phil- 
osophy of Science—who should order direct from the publisher—the subscription 
price is £4.50 (U.S.A. and Canada U.S. $14.00, other overseas £7.00). Member- 
ship of the Society is international and is by no means confined to people resident in 
the United Kingdom. Orders may be sent to any bookseller or direct to Aberdeen 
University Press, Farmers Hall, Aberdeen, Scotland AB9 2XT.


Notes to Contributors 

Articles, correspondence and books for review should be sent to John 
Worrall at the Department of Philosophy, London School of Economics, 
Houghton Street, Aldwych, London WC2A 2AE. Articles should be accompanied
by abstracts of about 100 words.


Material is accepted for publication in the Journal on the condition that copyright 
is assigned to the British Society for the Philosophy of Science. 


Three copies of all contributions should be sent, typed on one side only with wide 
margins and double-spaced. Footnotes, which should be as few as possible, should 
be typed separately, double-spaced at the end of the text. They should be referred 
to by consecutive numbering throughout the text, in which they will be inserted 
in page proof. Bibliographical references should be listed alphabetically at the end. 
The style of references should follow current Journal practice. 


Long quotations are printed in small type without quotes, and should be so marked 
in the typescript. Single quotes should be used to form names of words and for 
shorter quotations; double quotes for other purposes. Diagrams, tables, and illus- 
trations should be on separate sheets, with their desired position in the text 
clearly indicated. Symbolic formulae appearing as separate lines should be clearly 
distinguished in the text, and foreign letters (Greek, German, etc.) indicated in 
the margin. Material to be printed in italics, especially English letters used as 
symbols (‘A’, ‘x’, etc.), should be underlined. On matters of style and layout,
authors should consult CHAUNDY, T. W. et al. [1954]: The Printing of Mathematics.
London: Oxford University Press. 

Offprints of articles may be ordered when corrected galley proofs are returned. 
Fifty are supplied free; the price of extra copies is given on the order form.


© British Society for the Philosophy of Science 1979 


For permission to reproduce material from The British Journal for the Philosophy
of Science, please apply to the Editors. 


Back Volumes 

Volumes 1-16 may be obtained from Kraus Reprint Ltd., FL-9491 Nendeln, 
Liechtenstein or Kraus Reprint Corporation, 16 East 46 Street, New York, N.Y. 
10017. Volumes 17-29 may be obtained from Aberdeen University Press, 
Farmers Hall, Aberdeen, Scotland AB9 2XT. Prices on application. 


Brit. J. Phil. Sci. 30 (1979), 1-25 Printed in Great Britain




Towards a Theory of Mathematical
Research Programmes (I)* 


by MICHAEL HALLETT 


Introduction

1 The Development of Mathematical Theories 
(a) Progress in Informal Mathematics 
(b) A Criterion of Mathematical Progress


2 Why Did Cantor’s Theory of Ordinal Numbers Constitute Mathematical 
Progress? 
(a) Cantor’s Postulation of Transfinite Ordinal Numbers 
(b) Mittag-Leffler’s Theorem
(c) Borel’s 1895 Proof of the Heine-Borel Theorem 


3 Conceptual Innovation and the Role of Proof in Mathematics 


4 Does Hilbert’s Criterion Reflect Mathematicians’ Reactions to Theory 
Change? 

5 Heuristics and Mathematical Programmes 
Conclusion 


INTRODUCTION 
The main aim of the methodology of scientific research programmes 
(henceforth MSRP) is to set out clear, general desiderata for distinguishing 
between progressive and degenerating science which are in accord (more 
or less) with the informed, intuitive assessments scientists make in particular 
cases.1 In this paper I want to consider whether a modified form of that
methodology can perform the same function with respect to mathematics.
Mathematicians generally have an intuitive feeling for what constitutes
* Previous versions of this paper were read and criticised by John Bell, John Mayberry, 
Jerry Ravetz, John Worrall and Elie Zahar. I would like to thank them all, particularly 
John Worrall, for their valuable suggestions, though naturally none of them will agree 
with everything I say in this present version. 
1 For historical case studies, see the excellent contributions to Howson (ed.) [1976]. For 
the methodology of research programmes itself, see Lakatos [1970]. 
Received 21 March 1977 


good mathematics (see 1 below) but find it difficult to give an adequate
characterisation of the general principles underlying that feeling. Can a
methodology of mathematical research programmes supply this general
characterisation?

Obviously not, unless mathematical theories develop through different 
‘versions’ in the same way as do scientific theories according to Lakatos. 
But this is, I believe, a historical fact. (For instance, Lakatos’s ‘Proofs 
and Refutations’ traces the development of the theory of polyhedra 
through various versions). Hence the answer to the above question revolves 
around two separate issues. (1) Given a series of different versions of a 
developing mathematical theory, is it possible to specify conditions of 
acceptability (or progress or non-adhocness) of one version with respect 
to its predecessor similar to those given by MSRP for theoretical shifts in 
physical theories? (2) Do developing mathematical theories exhibit the 
kind of heuristic unity or heuristic evolution which according to MSRP 
is exhibited by programmes in physics? 

(2) is, I think, less problematic than (1), and it is not too difficult to 
argue for a positive answer to it. Consequently, I will deal with (2) only 
at the end of the paper (in 5) and then only rather briefly. 

The much more problematic and contentious question is (1). It might 
be argued that because mathematics and physical science are radically 
different the answer to (1) must be negative. For example, proofs play a 
highly important role in mathematics, but a much less significant role in 
physics.1 On the other hand physics can rely on a growing fund of
unproblematic statements of fact (observation reports of meter readings,
viewings through telescopes etc.) for which there is no analogue in mathe- 
matics. According to MSRP, significant progress is attained only when a 
programme makes some novel predictions which agree with the facts. If 
there are no analogous mathematical facts, how can there be a methodology 
of mathematical research programmes? 

Certainly these differences between mathematics and empirical science 
are highly important. I do not want to claim, as some philosophers have, 
that mathematics and science are really part of one enterprise. But after 
all, I am only looking for a criterion of mathematical progress analogous 
to MSRP’s criterion of scientific progress. These considerations only 
show, as one might expect, that the two criteria cannot be identical. 

1 Physics does occasionally make use of impossibility proofs (e.g. Rosenthal’s proof of 
the impossibility of an ergodic gas system) and, of course, it continually makes use of 
deductions from theories and initial conditions to predictions. But the impossibility 
results are usually lifted directly from mathematics as special cases of quite general 
theorems (for the example of Rosenthal’s proof, which was pointed out to me by Peter
Clark, see Brush [1967], pp. 168-83). As for strict deduction this is only a small part
of the role of mathematical proof: see below, 3. 
1 THE DEVELOPMENT OF MATHEMATICAL THEORIES
(a) Progress in Informal Mathematics 


Mathematicians, historians and philosophers frequently characterise good 
mathematics as mathematics which is important, significant, relevant or
deep, and bad mathematics as that which is none of these. Now while in 
specific contexts it may be intuitively clear what is meant by ‘relevance’, 
‘significance’, etc., the methodologist’s task, as I construe it, is to try to 
make these intuitive notions more precise, and less open to personal 
variation.

It is not difficult to put bounds on the notions by giving examples of,
say, theories whose results are clearly relevant or important, or new 
theories which clearly have nothing new or interesting to say. Thus, no- 
one would deny that if a theory finds applications in a progressing physical 
programme then this is a very strong demonstration of its ‘relevance’ or 
‘importance’. On the other hand, it is not hard to find historical examples 
of theories going through periods of obvious stagnation. The theory of 
algebraic forms in the period leading up to 1890 is one such example.1
Lakatos gives another in his history of the Euler conjecture. He describes
a period where the search was on for a ‘complete all-embracing’ formula
covering all known polyhedra. Complicated formulae were constructed to
deal with the latest counterexample only to be refuted by a new counter- 
example and replaced by a new formula. But each time the new formula 
was constructed from the old by simply adding terms (parameters) and 
filling in the result from the latest counterexample. Each new and more 
complicated version of the conjecture does nothing more than ‘save the 
phenomena’. As Lakatos says, such generalisations are ‘cheap’; they are 
ad hoc in a way which the philosopher of science would clearly recognise.2

It is tempting to take some very clear-cut cases and turn them into 
definitions. Von Neumann apparently succumbs to this temptation in his 
[1947]. Weyl had earlier suggested that classical mathematics can only be 
given a sense when fused with physics. Comparing Hilbert’s programme 
with Brouwer’s in his [1925], Weyl asserted that the former is only a serious 
rival to the latter if the concepts of classical mathematics which it seeks to 
defend are given physical interpretation. For, he claims, even a consistent 
mathematics is no good unless it has some meaning.3 Von Neumann,
perhaps frustrated with the failure of Hilbert’s programme to prove 
classical mathematics consistent, seems to follow Weyl’s suggestion. 


1 See Bell [1945], pp. 421-2 and Fisher [1966-7], section II.
2 See Lakatos [1976], pp. 79-81 (especially n. 1, p. 80) and pp. 97-8; see also p. 7 below.
3 See op. cit. p. 541. 




Worried that much purely formal mathematics is apparently just l’art
pour l’art, von Neumann suggests that the degeneration of a theory sets
in when the theory is ‘at a great distance from its empirical source’ ([1947],
p. 196), and that stagnation or degeneration can be halted or reversed with
‘the re-injection of more or less directly empirical ideas’ (ibid.).1 Now
while the idea of ‘empirical relevance’ is in some ways very attractive, it is
by no means as definite as it seems, and, if insisted upon as a necessary
condition for good mathematics, is far too restrictive.2

In the first place, it is not clear where the line is to be drawn. If a theory 
A is directly interpretable via a physical theory P, then certainly according 
to Weyl and von Neumann it is significant. Now assume that a theory B 
is used to solve problems in the ‘significant’ theory A. Does this make B 
significant, or must it find direct physical interpretation as well? Secondly, 
it is too narrow as a piece of advice to working mathematicians. Indeed the
dictum ‘work only on those theories which are physically interpretable’
would actually be counterproductive: it is impossible to say in advance 
whether or not a theory will be physically interpretable. Who would have 
said at the time of their proposal that differential geometry or the theory 
of Hilbert space would turn out to have applications in the theory of 
gravitation and quantum mechanics? Surely such theories were judged 
not according to their physical potential, but according to their qualities 
as mathematical theories. Thirdly, following on from this, the proposal is 
too narrow because it does not capture the actual history of mathematics. 
For example, the theory of algebraic forms which had made little headway 
up to 1890 was revived by Hilbert’s solution of a fundamental problem 
via a purely existential proof which very quickly led to a much more 
general theory of algebraic invariants.3 And again, the theory of polyhedra
was revived by Poincaré’s abstract proof which led to the modern theory
of combinatorial topology.4 Neither of these historically clear examples of
revival and progress were the result of ‘empirical injections’. 

Similar problems arise with the constructivist thesis. Constructivism 
may be more definite than the ‘empirical relevance’ thesis, since (at least 
according to some versions of the doctrine) it is possible to give in advance 
specific indications as to what is constructive and what is not.5 But it
1 Cf. Weyl [1949], p. 235. 

2 It should be pointed out that von Neumann probably did not intend ‘empirical relevance’
to be a rigid acceptability criterion, but rather a condition to be applied when theories
become too ‘baroque’, or fall under the influence of mathematicians who do not have
‘an exceptionally well-developed taste’.

3 See Fisher [1966-7], sections II and III.

4 In Lakatos’s list of formulae (1)-(7) on pp. 80-1 of his [1976], it is (5) which became the
standard ‘Euler formula’; see the discussion in ibid. pp. 97-8. It was (5) which was
proved by Poincaré (though for arbitrary dimensions). See also Bell [1945], pp. 459-60.
5 See for example the Preface and chapter 1 of Bishop [1967]. 
would appear that constructive mathematics is already too weak for some
parts of physics (certainly if physics makes extensive use of topology,
for example in cosmology).1 And certainly it is not sufficient for explaining
why large parts of classical mathematics were accepted, even those parts 
which can be given a constructivist interpretation. 

Unfortunately, then, neither of these attempts to characterise relevance 
as ‘empirical relevance’ or ‘numerical relevance’ (i.e. constructivism: see
the Preface to Bishop [1967]) is adequate. What is required is a more 
general characterisation of mathematical significance which perhaps 
includes these as special cases, but which also takes in such examples as 
Poincaré’s proof of the generalised Euler conjecture (or rather one of the 
generalisations of Euler’s conjecture), and at the same time excludes the 
ad hoc theoretical modifications which preceded it. But just what is it 
about Poincaré’s theorem and its proof which makes it ‘good’ mathe- 
matics; why does it stand out from the previous ‘cheap generalisations’? 
The answer, I suggest, is connected with the fact that Poincaré’s result 
does much more than just solve the problem it was originally constructed 
to solve. It turned out that Poincaré’s proof and the framework (definitions 
etc.) that it employs can be applied not just to geometric objects (polyhedra), 
but to analytic objects (manifolds) as well. Combinatorial topology began 
with the imposition of geometric (polyhedral) form on analytic manifolds. 
Once this is done, the proof of the Euler formula becomes a direct com- 
binatorial method of calculating the Betti numbers of the manifold, and 
thus of classifying it. This is a far cry from the previous ‘cheap generalisa- 
tions’. Not only did Poincaré’s result solve a problem in ‘solid geometry’ 
but it solved a great many more problems besides.3 The same is true of the
Hilbert result which saved the theory of algebraic forms from degeneration. 
According to Fisher the Hilbert theorem was originally intended as a 
solution to a specific problem in the theory of algebraic forms, but turned 
out to be applicable to a great many more algebraic problems. So much so 
that the theorem eventually became one of the cornerstones of modern 
algebra.4 Similar remarks apply to the Heine-Borel theorem, to take an 
example from analysis. The Heine-Borel property (the existence of a 
finite sub-covering) was originally established for countable coverings of 
a closed interval on the real line.5 But the property subsequently turned 
1 According to Bishop: ‘Very little is left of general topology after that vehicle of classical
mathematics has been taken apart and reassembled constructively. With some regret,
plus a large measure of relief, we see this flamboyant engine collapse to constructive size’
(op. cit., p. 63).

2 For further discussion see section 5.

3 For a brief indication of how Poincaré’s theory was applied by Schönflies and Brouwer,
see Wilder [1953], pp. 433-4, or Alexandroff [1932], pp. 11-12, 30-1.

4 See Fisher [1966-7], sections II and III.

5 See pp. 20-5 below.
out to be satisfied in so many other cases, that eventually the whole class
of compact spaces was isolated and studied as a separate category. 

In all these cases, the new results were applied in unexpected ways, to 
solve problems in circumstances they were not originally designed to 
cover. It is relatively easy to construct theories or design theorems for 
narrowly prescribed purposes; it is much more difficult to construct 
theories or theorems which subsequently turn out to have mathematical 
applications elsewhere. These cases deserve special merit. For it is through 
these that the various back connections and similarities of mathematics 
are discovered.1

This idea is by no means new. One can find hints of it in Gödel [1964], 
in Hadamard [1949], in Weyl [1951], in Wilder [1953] and more recently 
in Dieudonné [1975]. But it was most strikingly put by Hilbert: 

The final test of every new mathematical theory is its success in answering
pre-existent questions that the theory was not designed to answer. By their fruits
ye shall know them – that applies also to theories.2


I take this remark of Hilbert’s as my starting point in what follows. I 
will try to show how with some elaborations this comment can be turned 
into a criterion of progress for mathematical theories very like MSRP’s 
criterion of progress for empirical programmes. 


(b) A Criterion of Mathematical Progress 


As I have pointed out, there is no clear analogue of empirical data in 
mathematics. The first important feature of Hilbert’s comment is that it 
suggests a quite natural way around this difficulty. Hilbert suggests that 
a mathematical theory gains its support via the problems it solves, and thus 
that problem solutions constitute an analogue for correct predictions of 
empirical data. Empirical theories are set up to capture truths about the 
world, i.e. facts; hence appraisals of empirical theories are largely based
on comparing their relationship to facts. Mathematical theories are set 
up to solve, through their consequences, mathematical problems, so it is 
quite natural as Hilbert suggests that their merit be judged on the basis of 
their problem-solving ability. If it is possible to give reasonably precise 
systems of appraisal based on this substitute for empirical data, then the 
absence of mathematical ‘facts’ poses no serious problem for the 
methodologist of mathematics. Let us then see if it is possible.
Methodologies which accept that it is the facts if anything which lend 
a scientific theory support invariably stress that real support cannot be 
1 For the importance of establishing such ‘back connections’ see Wilder [1953], especially
his discussion of the evolution of the concept of curve, and the related diagram on
p. 431.
2 Hilbert [1926], p. 384.
achieved too easily. Hilbert’s conditions likewise demand that not every
problem solved by a theory should count in its favour. To this end Hilbert 
imposes two conditions. He demands that to demonstrate its worth a 
theory must solve a problem that pre-dated it, and that the theory must 
not have been designed to do this. As I see it, Hilbert’s first condition is 
intended as a minimal guarantee that the theory solves an ‘important’ 
problem, while the second condition is intended as a guarantee that the 
theory does not solve this problem in an ad hoc way. I will come back to 
the question of ‘importance’ below; for the moment I want to consider the 
question of ad hoc-ness. 

Hilbert’s formulation of his second condition is rather ambiguous. 
Reference to a theory’s ‘design’ might be taken as a reference to the 
intentions of its creator. If this is the case then it seems to me that Hilbert’s 
condition misses its mark. For example, Cantor certainly designed his 
theory of transfinite numbers with the intention of solving the continuum 
problem. Now in fact the theory did not solve this problem. But I fail 
to see why if it had Cantor’s intention ought to exclude the solution of this 
problem from being a ‘fruit’ of the new theory. However, if one interprets 
Hilbert’s intention as simply to guard against a theory having built in 
success (and this will be my interpretation in what follows) then it is not 
difficult to give Hilbert’s condition a precise non-psychologistic
interpretation. 

Problems are solved by specific statements entailed by theories. If one 
knows in advance that a particular statement or set of statements will solve 
a given problem then there is the possibility in constructing a new theory 
of building the statements into the theory, thus guaranteeing its ‘success’ 
in solving the problem. In other words the set of problem-solving state- 
ments (or the problem solution for short) has been used in the construction 
of the theory. Now one doesn’t want to say that such a move is by itself 
bad; but clearly as far as the solution of the problem in question is con- 
cerned the new theory has taught us nothing. Consequently, this particular 
solution shouldn’t count as a genuine success for the theory. Its genuine 
successes, if any, must be among the other problems it solves.

Note that what makes many of the revised polyhedral theories discussed 
by Lakatos ad hoc is that their only ‘success’ is built-in success. The 
method of exception barring discussed on pages 24-30 of Lakatos’s [1976] 
creates rather extreme examples of theories of this kind. For instance, in 
Lakatos’s dialogue pupil Alpha proposes (p. 15) a twin tetrahedron (two 
tetrahedra joined at one vertex, or alternatively along one edge, for which 
V − E + F = 3) as a counterexample to Euler’s conjecture that V − E + F = 2

1 See 2(a) below.
for all polyhedra, V, E and F being the number of the polyhedron’s
vertices, edges and faces respectively. Applying Lhuilier’s and Gergonne’s 
method of exception barring (Lakatos [1976], p. 26, n. 2; the method is 
advocated in the dialogue by pupil Beta) to resolve this conflict yields the 
new conjecture: V − E + F = 2 for all polyhedra except twin tetrahedra. If
challenged to devise a theory which also covers twin tetrahedra, the exception
barrier would presumably reply with: V − E + F = 2 for all polyhedra
except twin tetrahedra, for which V − E + F = 3. Both these conjectures
solve the given problem by building in the ‘facts’ as face-saving clauses. 
But this clearly exhausts their problem solving capacity: for polyhedra 
other than twin-tetrahedra they refer back to Euler’s conjecture. Such 
exception-barring conjectures teach us nothing new: they simply rearrange 
existing knowledge so as to avoid refutations.1 
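
The arithmetic behind the twin-tetrahedron counterexample can be checked mechanically. The following is only a minimal sketch, not part of the original argument: it assumes each solid is given as a set of triangular faces over labelled vertices, and the helper names `tetrahedron` and `euler_characteristic` are hypothetical, introduced here for illustration.

```python
from itertools import combinations


def tetrahedron(vertices):
    """Return the four triangular faces of a tetrahedron on four vertex labels."""
    return {frozenset(face) for face in combinations(vertices, 3)}


def euler_characteristic(faces):
    """Compute V - E + F for a solid given as a set of triangular faces.

    Assumption: every pair of vertices within a triangular face is an edge,
    and shared vertices or edges are identified by sharing labels.
    """
    vertices = set().union(*faces)
    edges = {frozenset(pair) for face in faces for pair in combinations(sorted(face), 2)}
    return len(vertices) - len(edges) + len(faces)


# A single tetrahedron satisfies Euler's conjecture.
single = tetrahedron("abcd")

# Twin tetrahedra joined at one vertex (the shared label is 'a').
twin_vertex = tetrahedron("abcd") | tetrahedron("aefg")

# Twin tetrahedra joined along one edge (the shared labels are 'a' and 'b').
twin_edge = tetrahedron("abcd") | tetrahedron("abef")

for name, solid in [("single", single), ("twin (vertex)", twin_vertex), ("twin (edge)", twin_edge)]:
    print(name, euler_characteristic(solid))
```

On this representation the single tetrahedron gives 4 − 6 + 4 = 2, the vertex-joined pair gives 7 − 12 + 8 = 3, and the edge-joined pair gives 6 − 11 + 8 = 3, which are the values the exception-barring conjectures simply write into themselves.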

Once the nature of this ad hocness is recognised it is not difficult to 
specify a condition which guards against it. Indeed, this problem has been 
met with in the methodology of empirical science and tackled by Zahar 
and Worrall.2 Thus if a theory solves a problem and the statement solving
the problem was not used in the construction of the theory, then the 
solution is not ad hoc. To refer to Cantor’s continuum problem again, 
although Cantor conjectured and firmly believed the statement ‘only two 
infinite powers are represented in the linear continuum’ this statement was 
not used in the construction of his theory of transfinite numbers. Thus, if 
the theory had solved the continuum problem (as he intended!) this 
solution would have been non-ad hoc, and should certainly have been 
counted as a genuine success. 

Hilbert’s first condition also has its difficulties, but these are harder to 
overcome. As I said, I think Hilbert’s condition is intended as a minimal 
condition on the importance of a theory. But importance is a tricky issue. 
Clearly any new theory since it will introduce new objects or new concepts 
will create a host of new problems concerning itself. It may happen that 
the new theory is powerful enough to solve some of these problems. But 
although this might demonstrate the ingenuity of the creator of the theory 
or of its devotees, this does not show by itself that the theory is a significant
contribution to mathematics. Thus, Hilbert demands that whatever else


1 Other examples are mentioned on p. 3 above, text to n. 2. These examples are by no 
means as trivial as the exception barring examples. For, although the new conjectures 
are apparently created by adding and adjusting parameters to deal with specific refuta- 
tions, they are nonetheless quite general. Consequently, they have explanatory and 
problem-solving capacity beyond that trivially built into them. However, only in the 
case of conjecture (5) was this capacity used (see above, p. 4, footnote 4, and p. 5).
The rest solved only those problems whose solutions were built into them. 

2 See Zahar [1973], pp. 101-4 or, for a fuller account, Worrall [1978]. See also pp. 9-10
below. 
it does a good theory must solve a pre-existent problem. For this guarantees
that the theory ‘goes beyond’ its own subject matter. The condition in effect 
simply makes explicit the general recognition that one of the characteristic 
features of mathematical progress is the emergence of theories which 
establish fundamental connections between apparently diverse fields.1
Good theories are thus theories which help to establish such connections. 

But while the intention of Hilbert’s condition is clear, its formulation is, 
I think, too rigid. For, in the first place what if a theory turns out to solve 
an important problem not of its own creation which post-dates it? Surely 
the theory will still have demonstrated its importance even though it 
doesn’t satisfy Hilbert’s condition. However, this difficulty is avoided by 
dropping any reference to temporality and simply demanding that the
theory solve some problem not of its own creation, regardless of whether 
this problem pre-dates the theory or is discovered later. 

Secondly, on a slightly different tack, Hilbert’s formulation implies 
that a theory via its consequences must completely solve a problem, that
is, it must supply all the elements which go to make it up. But this again 
is too strong. Surely, if a theory only supplies part of the solution to a 
problem, the other parts coming from elsewhere, then it still deserves 
credit for its contribution (providing it was not achieved in an ad hoc 
way). This difficulty is also easily avoided by slightly relaxing Hilbert’s 
condition. Instead of demanding that a theory solves a problem, we need 
only demand that it is used in the solution of a problem. 

However, the modified condition is still only a minimal condition, a 
necessary but not sufficient condition on a theory’s importance, since a 
theory might solve a quite trivial problem outside its own domain. 

But I do not regard this situation as necessarily distressing. In the first 
place, if a theory satisfies the Hilbert conditions (as I have modified them) 
then this at least does show that the theory is potentially a valuable tool 
of mathematical discovery. Moreover, in historical investigations of 
theories it may be possible to discover historical evidence for the importance 
of a problem, particularly if mathematical theories are indeed arranged as
programmes with long term research aims.3


The two modified Hilbert conditions form the basis of a criterion of 
progress for series of theories which is now (partly by design) quite similar 
to MSRP’s. I propose that the setting up of a new mathematical theory 


1 See p. 6 above, and also Wilder [1953], particularly p. 444. 

2 I discuss an example of this below, namely where the theory of transfinite numbers
was used in solving a problem of Borel’s which post-dated it. See section 2(c).

3 I give an example of a clearly identifiable important problem below, namely the problem
of finding analytic expressions to represent analytic functions: see 2(b).
T_n constitutes progress with respect to its predecessor T_m if T_n is used
in the solution of at least one problem P which T_m did not solve,1 provided
that P is not of T_n’s own making and that the statement solving P was
not used in the construction of T_n. (In what follows I will call this
criterion ‘Hilbert’s criterion’ for ease of reference.) The similarity with
MSRP is quite clear. MSRP states that a consequence of T′ provides
support for T′ providing it is true and was not used in the construction
of T′. (This last condition is that of Zahar and Worrall: see the references
in n. 2, p. 8 above.) Hilbert’s criterion states that a consequence of T′
provides support for T′ if it is used to solve an ‘important’ problem and
likewise was not used in the construction of T′. Thus both criteria state
similar desiderata for progress once theoretical novelty has been achieved. 
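
Read schematically, the criterion just stated might be set out as below. This is only an illustrative formalisation and does not appear in the original text. It writes T_n for the new theory, T_m for its predecessor and \sigma_P for the statement solving P; the predicate names UsedIn, Solves, Internal and BuiltIn are assumptions introduced here for compactness. Solves(T_m, P) is to be read as relative to the time of T_n's introduction, and the arrow runs one way only, since the text gives a sufficient condition.

```latex
\[
\exists P \,\Bigl[\, \mathrm{UsedIn}(T_n, P)
  \;\wedge\; \neg\,\mathrm{Solves}(T_m, P)
  \;\wedge\; \neg\,\mathrm{Internal}(P, T_n)
  \;\wedge\; \neg\,\mathrm{BuiltIn}(\sigma_P, T_n) \,\Bigr]
\;\Longrightarrow\; \mathrm{Progress}(T_n, T_m)
\]
```

Here Internal(P, T_n) abbreviates 'P is of T_n's own making' and BuiltIn(\sigma_P, T_n) abbreviates 'the statement solving P was used in the construction of T_n'.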

Can Hilbert’s criterion be applied to the actual history of mathematics? 
By presenting a detailed case-study I want to show that it can. I apologise 
in advance for the length of the study and for burdening the reader with 
so much detail. In defence, I might say that since the methodology of 
mathematics is still in its infancy detailed case studies are certainly needed. 
Moreover, the case history I use (Cantor’s theory of transfinite numbers) 
raises various difficulties concerning Hilbert’s criterion which will give 
me the opportunity to make the criterion clearer. 


2 WHY DID CANTOR’S THEORY OF ORDINAL NUMBERS 
CONSTITUTE MATHEMATICAL PROGRESS? 


(a) Cantor’s Postulation of Transfinite Ordinal Numbers 


Cantor first introduced the transfinite ordinals in his [1880], the second 
in a series of six papers with the title ‘On infinite linear point-manifolds’.2
Their introduction marked the beginning of a two-fold attack on the general 
problem of powers, and in particular the problem of the power of the continuum. 
The problem of powers in its general form was that of finding a calculus 
of power or absolute size adequate for discussing and describing the sizes 
of arbitrary infinite sets of points. By 1878 Cantor had clearly adopted the 
principle of using one-one correspondences to measure the relative size of 
sets (a principle which Bolzano and others had earlier rejected); and he 
had already implicitly adopted the idea that an infinite set has an absolute
size which he called its power. But three questions now arise. How can
1 That is, did not solve P up to the time of Tw’s introduction. It may be shown sub-
sequently that Tm can solve P; but see below, 2(c) and particularly 3.
2 The six papers are his [1879], [1880], [1882], [1883a], [1883b] and [1884].
3 The term power was originally used in the relative sense. Thus Cantor said that ‘A has 
the same power as B’, or ‘A has smaller power than B’ if there is a one-one correspondence 
between A and B, or between A and a subset of B respectively. (See Cantor [1878], 


p. 119.) But then he quickly slips into using the term in an absolute sense, thus hypostatiz- 
ing it. He talks, for instance, of ‘the smallest amongst infinite powers’ (op. cit., p. 120). 
these infinite powers be represented? Is it possible to set up a scale of size
and an effective arithmetic for infinite power, just as we have a scale and 
an arithmetic for finite powers? (Cantor had already identified the powers 
of finite sets with the natural numbers: see his [1878], p. 119.) And how 
many powers are there? 

Cantor’s papers [1874] and [1878] provided partial answers to the last 
of these questions. In his [1878] Cantor showed that all Euclidean intervals 
of whatever dimension (even of countable dimensions) have the same power 
as any linear interval. This was a surprising result,1 but it nevertheless
simplified the search for different powers since it showed that all powers 
of Euclidean point sets of whatever dimension must be already represented 
in the real line. Cantor, of course, knew that there were at least two powers 
amongst infinite linear sets, since he had shown in his [1874] that a linear 
interval has a ‘greater power’ than the sequence of integers. Indeed he also 
showed that the algebraic numbers, and thus the rational numbers, have 
the same power as the sequence of integers. This possibly suggested to 
Cantor that there are very ‘few’ infinite powers in Euclidean space. In 
any case, in his [1878] he put forward the strongest possible version of this 
thesis, for he conjectured that there are just two. He conjectured that any 
infinite linear set must either have the power of the natural numbers or 
the power of the whole line. This was the first version of Cantor’s con- 
tinuum hypothesis. 

The desire to prove this conjecture was the main creative spur to Cantor’s 
work from this time on, and the main reason for tackling the general problem 
of powers.2 Nevertheless, Cantor admitted the possibility that it might be
false. Despite his belief in its truth, his aim as a mathematician was to 
decide the issue, first by looking for a proof, and then if a proof was not 
forthcoming, by looking for a disproof. From Cantor’s viewpoint in 1878 
there were two possible ways of approaching the continuum problem. One 
was to try to prove (or disprove) directly that any infinite linear point set is 
either countable or has the power of the continuum. The second was via
the general problem of powers, by defining an arithmetical scale of infinite 


1 See the correspondence with Dedekind for the period May 1877 to January 1879 in 
Noether and Cavaillés (eds.) [1937], pp. 21-50. 

2 In his famous report on set theory of 1899 Schönflies remarked of the continuum problem
that


... we have this to thank for a large part of his [i.e. Cantor’s] set-theoretic investiga-
tions. (Schönflies [1899], p. 49).


Certainly, as Schönflies later remarked, Cantor struggled with the problem on and off
throughout his life, and put his ‘highest ability’ into trying to solve it: see Schönflies
[1927], p. 16. The effects of this struggle are dramatically presented in the excerpts
from Cantor’s letters to Mittag-Leffler, quoted in Schönflies [1927], and Schönflies’s
comments on them.
powers, and showing at what place in the scale the power of the continuum
is represented. If it is represented by the power in second place in the 
scale the continuum hypothesis must be correct; if not, then it must be 
incorrect. Cantor tried both lines of attack, and both involved the trans- 
finite ordinals, though in different ways. The second line of attack led to 
the modern theory of cardinal numbers in which the ordinal numbers are 
used to construct the scale of alephs. But it is the first line of attack which 
concerns us more here since it was for this that the ordinal numbers were 
first introduced. 

Cantor’s direct approach to solving the continuum problem was to set 
up a further method of classifying point-sets alongside the classification 
by one-one correspondence and to try to obtain information about the 
latter via the former. His further method of classification was by studying
certain decomposition properties using the idea of point-set derivation. In
his remarkable [1872]1 Cantor introduces the idea of the derived set of a
point set P, denoted by $P^{(1)}$ and containing exactly the ‘limit points’ (i.e.
accumulation points) of P. If P is bounded and infinite then according to
Weierstrass’s theorem $P^{(1)}$ will be non-empty. Correspondingly, if $P^{(1)}$ is
bounded and infinite it too will have a non-empty derived set $(P^{(1)})^{(1)}$,
which Cantor denoted by $P^{(2)}$. Cantor extended this to $P^{(n)}$ for any finite
n. This is as far as Cantor went at this stage: the central theorem of his
[1872] uses only sets P for which $P^{(n)} = \emptyset$ for some n.2 However, Cantor
recognised even this early that there is a much more general heuristic
method here, namely: to discover properties of P by investigating the
sequence of derived sets. There are clearly point sets for which $P^{(n)} \neq \emptyset$
for any n, for example an interval, or the rationals in an interval. Thus, in
order to sub-classify these ‘second-species’ sets (‘first species’ sets being
those with $P^{(n)} = \emptyset$ for some n)4 Cantor required some way of analysing


1 The central topic of this paper is the problem of the uniqueness of trigonometric series
expansions of functions. (The history of this problem is described in both Hawkins
[1970] and Dauben [1971].) But it was in this paper that Cantor first began to use the
idea of arbitrary infinite sets of points, and where he gives his classical definition of real
numbers as Cauchy sequences of rationals.

2 The theorem is:

If f(x) is a function which is represented by a trigonometric series at all points of
the set $[0, 2\pi] - P$, where $P \subseteq [0, 2\pi]$ is such that $P^{(n)} = \emptyset$ for some n, then the repre-
sentation is unique.

See Cantor [1872], pp. 99-101, or Hawkins [1970], pp. 21-28 or Dauben [1971].

3 In his [1872] Cantor wrote:

If a point-set is given in a finite interval, then in general a second point-set is given
with it, and with this second a third, and so on [Cantor means $P^{(1)}$, $P^{(2)}$, etc.].
These point-sets are essential in order to be able to understand the nature of the first
point-set (p. 97; my italics).

4 Cantor introduced the terms ‘first species’ and ‘second species’ in his [1879], p. 140.
the derivatives ‘beyond’ the $P^{(n)}$. To this end he introduced in his [1880]
new ‘symbols of infinity’, which were contextually defined via the derivation
process. Thus, $P^{(\omega)} = \bigcap_{n=1}^{\infty} P^{(n)}$, $P^{(\omega+1)} = (P^{(\omega)})^{(1)}$ and so on.1
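
The derivation process can be followed on a few simple subsets of the real line. The sketch below is illustrative only: the sets $P_1$, $P_2$ and $P_3$ are assumptions chosen for this example and are not taken from Cantor's papers.

```latex
% Illustrative examples only; P_1, P_2, P_3 are assumed sets chosen for this
% sketch, not examples drawn from Cantor's [1872] or [1880].
\begin{align*}
P_1 &= \{\, 1/n : n \ge 1 \,\}: &
  P_1^{(1)} &= \{0\}, \quad P_1^{(2)} = \emptyset ; \\
P_2 &= \{\, 1/m + 1/n : m, n \ge 1 \,\}: &
  P_2^{(1)} &= \{0\} \cup \{\, 1/m : m \ge 1 \,\}, \quad
  P_2^{(2)} = \{0\}, \quad P_2^{(3)} = \emptyset ; \\
P_3 &= \mathbb{Q} \cap [0,1]: &
  P_3^{(n)} &= [0,1] \ \text{for every finite } n, \quad
  P_3^{(\omega)} = \bigcap_{n=1}^{\infty} P_3^{(n)} = [0,1].
\end{align*}
```

On this reading $P_1$ and $P_2$ are first-species sets, since some finite derivative is empty, while $P_3$ is a second-species set whose derivatives never change after the first step, so that the finite derivatives alone give no further information about it.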


Cantor’s hope was that this more general idea of derived set would pro- 
vide an exhaustive classification of point-sets discriminating enough to 
give information about the power of any set by reducing it to components 
whose power is known or easily calculable. The introduction and use of 
the symbols in his [1880] and [1883a] constituted the first stage in the 
creation of Cantor’s new theory. But for both logical and mathematical 
reasons Cantor was forced to shift to a second stage in which the ‘symbols’ 
were reintroduced as transfinite ordinal numbers. 

In the first place, Cantor had already begun to use an arithmetic of 
symbols, combining them both with themselves and with the natural 
numbers. To take an example from his [1880], pp. 357-8, he there introduces
$P^{(n\omega)}$, $P^{(\omega^2)}$ and more generally $P^{(n_0\omega^{\nu} + n_1\omega^{\nu-1} + \ldots + n_\nu)}$ for
natural numbers $\nu, n_0, \ldots, n_\nu$. This presupposes that the ‘symbols’ and the
natural numbers are objects of the same kind, subject to the same arith- 
metical operations and obeying generalised arithmetical rules. But this 
meant that the ‘symbols’ had to be presented as numbers and the generalised 
arithmetic spelt out. Secondly, some classification of the symbols them- 
selves was required. The attempt to establish decomposition theorems for 
arbitrary sets via derivation presupposed an investigation of such questions 
as how many new symbols are required to cover all cases. This meant that 
Cantor could no longer rely on ad hoc introduction of ‘symbols’, but re- 
quired general principles for their introduction and classification. Thus in 
his [1883b] Cantor re-introduced the symbols in a much more general
setting as transfinite ordinal numbers postulated to exist in their own
right away from the context of point set derivation.2 These numbers
extend the natural number sequence and the extended sequence of ordinal 
numbers then satisfies a general ordinal arithmetic. Cantor now had a 
much more versatile point-set theory with its apparatus of recursive 
definitions, transfinite induction and so on, with which to tackle the 
continuum problem, and in particular the problem of decomposing point 
sets by derivation.3


1 See Cantor [1880], pp. 357-8.

2 Cantor in effect postulated the existence of the transfinite ordinal numbers via his two
so-called generating principles ([1883b], p. 547 and pp. 576-8). Although the problem 
was to worry later set theorists a good deal, Cantor at this stage made no attempt to 
define the numbers in terms of the background set theory. 

3 Although, as I’ve explained, Cantor introduced the ordinals in two stages, when 
referring to ‘Cantor’s new theory’ I shall not distinguish between them.
Cantor’s new theory is now almost universally regarded as good mathe-
matics, and a substantial improvement on its predecessor, point-set theory. 
I argue that Hilbert’s criterion can explain this judgment because the 
theory was used to solve ‘important’ problems in a non-ad hoc way. I 
will describe two successful applications of Cantor’s new theory in solving 
important problems: the first application is in the proof of Mittag-Leffler’s
theorem; the second is Borel’s first proof of the Heine-Borel
theorem. Later I will indicate that these two applications were of some
importance in changing mathematicians’ opinions about Cantor’s theory.
This will suggest that, whether they do so consciously or unconsciously,
mathematicians themselves apply something like Hilbert’s criterion in
appraising new theories. 


(b) Mittag-Leffler’s Theorem


In 1884 Mittag-Leffler made substantial use of Cantor’s new theory in 
extending certain fundamental theorems due to himself and Weierstrass 
concerning the representation of analytic functions. 

The representation problem grew out of Weierstrass’s definition of 
analytic functions in terms of power-series. Each power-series is taken to 
represent an analytic function inside its radius of convergence and 
Weierstrass turned this into a general definition of analytic functions 
using the notion of analytic continuation. In fact, he defined an analytic 
function as a ‘monogenous system of power-series’, meaning by this a
class or set of power-series, the elements, such that between any two points
$a_1$ and $a_2$ inside circles of convergence of any two elements it is possible
to get from $a_1$ to $a_2$ by analytic continuation. The function so defined is
then said to be regular in the interior of the union of the discs of con- 
vergence of the elements. 

But this notion of a function is rather unwieldy. In the first place a 
system may well contain uncountably many elements. But more import- 
antly it does not give a global picture of the behaviour of the function. In 
particular, for any given system of power-series it will not generally be 
clear how the function defined by this system behaves in the
neighbourhood of singularities, or even where these singularities are.2 Thus,
Weierstrass in 1876 raised the following general problem: given a single- 
valued analytic function, is it always possible to find a single analytic 


1 Though in his [1888] Poincaré showed that a Weierstrassian analytic function can always 
be defined by a countable system of power-series simply by considering a countable 
dense set of regular points and taking the power-series based on them. 

2 For an excellent critical discussion of Weierstrass’s conception of analytic functions, see
Borel [1898], chapter 4. 
expression (which in general will be an infinite series1) which represents
the function in its domain of regularity and which actually exhibits the 
singularities of the function by virtue of its form? 

As Weierstrass points out (Weierstrass [1876], pp. 81-3) the problem is 
just that of finding a common form for functions regular in a given domain. 
The problem had been solved in certain simple cases. For example, if 
the point c is the only singularity in the finite plane, then all functions with 
this as the only finite singularity will be representable by some rational 
or transcendental function of 1/(z—c) (depending on whether c is a pole, 
or an essential singularity), plus an entire function (i.e. a function regular in 
the whole finite plane). More generally, finite sums and products of such 
functions can be constructed to cover the case where there are a finite 
number of singularities in the finite plane.2 Weierstrass suggested taking
these known cases as a pointer and proposed solving the general problem
step by step, using similar methods applied to more and more complicated
sets of singularities.3

Weierstrass believed he had solved the problem where the set of
singularities is made up of a finite number of essential singularities, and
arbitrarily many, even infinitely many poles.4 But in fact, as he later
acknowledged,5 he had really only dealt with the case where there are
finitely many poles, in other words when all the singularities are isolated 
(i.e. can be surrounded by a regular disc). The first step in dealing with 
the infinite case was taken by Mittag-Leffler in 1877. 

Mittag-Leffler succeeded first in solving the problem in the case where 
the singularities form an infinite sequence of poles tending to a limit at $\infty$,
which is then a non-isolated essential singularity. He showed: 


(1) Let $a_1, \ldots, a_n, \ldots$ be a sequence of distinct complex points, with
$\lim_{n \to \infty} a_n = \infty$, and let $(g_n)$ be a corresponding sequence of rational functions,
each with a single pole at $a_n$, and no other singularities. Then it is
possible to construct a single-valued analytic function F(z) with poles

1 An analytic expression will be formed by putting together an infinite series of expressions, 
each formed from the variable z by a finite number of instances of the addition,
subtraction, multiplication and quotient operations.

2 See Weierstrass [1876], pp. 79-84. The problem of taking the product of finitely many
transcendental functions (i.e. infinite series) led Weierstrass to his famous factor theorem,
i.e. forming expressions for functions with infinitely many prescribed zeros. (See op.
cit., pp. 85-6, or e.g. Hille [1959], pp. 226-9). This is necessary because one or more
of the transcendental functions might have infinitely many zeros.

3 See Weierstrass [1876], p. 81. 

4 Ibid., p. 85. 5 See Weierstrass [1880], p. 199. 
$a_1, \ldots, a_n, \ldots$, and an essential singularity at $\infty$, such that F(z) has
the same principal parts as $g_n$ at $a_n$. The simplest such F is given by the
expression

$$F(z) = \sum_{n=1}^{\infty} \left[ g_n(z) - \gamma_n(z) \right]$$

where the $\gamma_n$ are suitably chosen polynomials in z and are actually
constructed from the $g_n$. There are more general functions than F,
having the required properties, but any F’ having these properties will
be of the form

$$F'(z) = F(z) + G(z)$$

where G is an entire function.1
This solves Weierstrass’s problem for this case, and moreover by a fairly 
simple and direct extension of the methods used in the finite case. The 
complications creep in only in choosing the $\gamma_n(z)$ in such a way that the
series


$$\sum_{n=1}^{\infty} \left[ g_n - \gamma_n \right]$$


converges uniformly in every regular region. 

But now, as Weierstrass showed in his [1880], it is possible to generalise 
the results of his [1876] to the case where there are finitely many essential 
singularities, and infinitely many poles. What Weierstrass had neglected
in his [1876] was the possibility that one of the essential singularities is the
limit of some sequence of the poles. In other words, he explicitly assumed
that the essential singularities (indeed all the singularities) are isolated.
Mittag-Leffler’s theorem shows how to get round this problem. First,
Weierstrass showed that we do not need to assume that the limit of the
$a_1, \ldots, a_n, \ldots$ is $\infty$: the result will hold where $\lim a_n = c$ for any c
(which is now an essential singularity) in the extended plane. (Weierstrass
[1880], pp. 195-6.) Secondly, Weierstrass extended the result to cover a
finite set of essential singularities $c_1, \ldots, c_n$ where any or all of the $c_i$
can be the limit of a sequence of poles (ibid., pp. 196-9).

Thus all functions whose singularities are given by the points $c_1, \ldots, c_n$
and the points $a_{ij}$, as i ranges over 1, ..., n and j ranges over 1, 2, ...,
m, ..., and $\lim_{j} a_{ij} = c_i$ for each i, will take the form:

$$F'(z) = F(z) + G(z).$$


1 See Osgood [1928], pp. 565-6, or almost any general textbook on function theory.
(1) is usually referred to in textbooks as ‘Mittag-Leffler’s theorem’; the generalisations
mentioned below are only given in the more advanced textbooks. The proof standardly
given for (1) is Weierstrass’s from his [1880]. For a very clear rendering, see Hurwitz
and Courant [1929].

2 See for instance Hille’s remarks on pp. 218-19 of his [1959].
F(z) is the function constructed by Mittag-Leffler’s method out of the $g_{ij}$,
where each $g_{ij}$ has the sole singularity $a_{ij}$; and G(z) is any function whose
only singularities are among $c_1, \ldots, c_n$. The problem of representing a
particular F’(z) with these given singularities then reduces to finding a
particular G(z) with F’(z) = F(z)+G(z).

So far, Weierstrass and Mittag-Leffler had not required any special 
techniques of point-set theory. Their results were achieved by the in- 
genious use of analytic methods. But it is clear that if Weierstrass’s pro- 
gramme was to be carried any further some detailed analysis of the structure 
of infinite point-sets would be required. What was needed especially, 
as Weierstrass’s difficulty of 1876 shows, was an analysis of the behaviour, 
distribution and concentration of the limit points of infinite sets. This was 
just what Cantor’s theory of 1872 began to develop, and what his new 
theory of 1880 and 1883 sought to extend. Indeed, the theory of point-set 
derivation is precisely the theory of the distribution or concentration of 
point-sets.1 Hence it is not surprising that after 1880, having exhausted the
possibilities of purely analytic techniques, Mittag-Leffler began to combine
these techniques with Cantor’s theory in his attempts to extend the repre-
sentation theorem to cover more general sets of singularities. Indeed
Mittag-Leffler’s progress with the problem from 1880 on mirrors almost
exactly the various successive stages of generalisation which Cantor’s 
theory passed through. 

Mittag-Leffler’s first step (in his [1882]) was to use Cantor’s 1872
theory of first-species sets to generalise Weierstrass’s 1880 result. This was
a natural step to take since Weierstrass’s result concerned a particular
kind of first-species set, namely those P for which $P^{(1)}$ is finite. Mittag-
Leffler2 wanted to extend this to any P for which $P^{(n)}$ is finite for some


1 In his [1903] Russell remarks: 

Popularly speaking, the first derivative consists of all points in whose neighbourhood 
an infinite number of terms of the collection are heaped up; and subsequent 
derivatives give, as it were, different degrees of concentration in any neighbourhood 
(p. 324). 

He goes on: 
Thus, it is easy to see why derivatives are relevant to continuity; to be continuous, 
a collection must be as concentrated as possible in every neighbourhood containing 
any terms of the collection (ibid.).

This gives some idea of why Cantor was using derivatives to study continua. 


2 In this respect Mittag-Leffler was following the path marked out by Cantor in 1872 in 
his work on trigonometric series, and later followed by Dini in his work on integration. 
See Hawkins [1970], chapters 2 and 3. The extension is quite natural. For in many 
respects infinite sets with only finitely many limit points behave like finite sets, particularly 
with regard to their concentration in intervals. Likewise, infinite sets which have only
a finite number of nth order limit points are also good generalisations of finite sets, for, 
like finite sets, they are very thinly distributed (e.g. they are nowhere dense). 


natural n.1 But as Mittag-Leffler acknowledged (loc. cit., p. 941) this was
surely not the most general theorem possible. Cantor had claimed in his
[1880] that there are many second-species sets which are countable. So
presumably if one could understand more about the way the derivatives of
such sets behave then surely it must be possible to solve the representation
problem for some of these. This, however, depended on Cantor’s theory of
classification of second species sets and this was not yet very far advanced,
at least in its published state. The breakthrough came in Cantor’s [1883a]
and [1883b] with the new theory of transfinite numbers and transfinite
derivation. It was this theory which Mittag-Leffler took up and used in
his [1884a].

Mittag-Leffler’s first fundamental result in his [1884a] was the theorem:
(2) Let Q be an isolated point-set (i.e. $Q \cap Q^{(1)} = \emptyset$). Then it is possible
to construct an analytic expression F(z) such that the singularities of
F are just the points of $Q \cup Q^{(1)}$.2

The proof of (2) depended crucially on Cantor’s [1883a] result that every 
isolated set is countable (loc. cit., p. 158). (The proof uses Cantor’s [1882] 
theorem that any infinite set of n-dimensional spheres or intervals which 
have at most boundary points in common must be countable.) Given this, 
Mittag-Leffler could assume that $Q = \{a_1, \ldots, a_n, \ldots\}$ and that there is
a countable sequence $g_n$ of rational or transcendental functions each
having just the singularity $a_n$. (This follows because, by definition, each
point of Q can be surrounded by a disc of regularity.) This done, the
analytic expression F(z) is constructed in a way similar to the construction
of F(z) in (1).3 Then, since F(z) has singularities at $a_1, \ldots, a_n, \ldots$ and
any limit of singularities is itself a singularity, F(z) has singularities
at all points of $Q \cup Q^{(1)}$.

But now, as Cantor had observed, for any P, $P^{(1)} - P^{(2)}$, $P^{(2)} - P^{(3)}$ and
so on must be isolated; indeed $P^{(\alpha)} - P^{(\alpha+1)}$ must be isolated for any number
or ‘symbol of infinity’ $\alpha$. Moreover, if $P^{(1)} \subseteq P$, as must be the case if P
is the set of singularities of a function, then $P - P^{(1)}$ is isolated as well.4
This combined with (2) opened the way for Mittag-Leffler’s second
fundamental result.

1 See Mittag-Leffler [1882], pp. 938-40.

2 See loc. cit. pp. 8 and 2273. The definition of an isolated set as one which does not intersect
its first derivative is given in Cantor [1883a].

3 See, e.g. Osgood [1928], pp. 569-74. (2) is sometimes referred to as the ‘Generalised
Mittag-Leffler Theorem’. It is interesting to note that not only does the proof depend
on Cantor’s theorem that every isolated set is countable, but also on the proof of the latter.

4 These results of Cantor not only enabled Mittag-Leffler to prove and extend (2), but
they also served to explain his previous theorems. For example, in his [1882] in dealing
with first-species sets Mittag-Leffler had implicitly assumed that each of $P - P^{(1)}$,
$P^{(1)} - P^{(2)}$, ..., $P^{(n-1)} - P^{(n)}$ (P being of the nth kind) is countable (see loc. cit., pp. 939-
40). Cantor’s [1883a] justifies this assumption.




(2’) Let F’ be a single-valued analytic function whose singularities form
a set P. Assume that $Q = P - P^{(1)} \neq \emptyset$; then we can form an
analytic expression F such that

$$F'(z) = F(z) + G_1(z)$$

where the singularities of F are just the points of $Q \cup Q^{(1)}$ and $G_1$ is
a single-valued analytic function whose singularities form a set $P_1$
with $P_1 \subseteq P^{(1)}$ (i.e. $G_1$ is regular in Q).1


F is constructed as in the proof of (2), since Q must be an isolated set.
The importance of (2’) is that it shows that the representation problem
for F’(z) in (2’) can be reduced to that for $G_1(z)$ which has a simpler set
of singularities, namely $P_1 \subseteq P^{(1)}$. But given that $P^{(1)}$ can be broken
down into $(P^{(1)} - P^{(2)}) \cup P^{(2)}$ where $P^{(1)} - P^{(2)}$ is an isolated set there is the
possibility of applying theorem (2’) to $G_1$. Thus, if $P^{(1)} - P^{(2)} \neq \emptyset$, F’(z)
can be represented by F(z), some $F_1(z)$ constructed for $G_1(z)$ and some
new $G_2(z)$ whose singularities are made up from the much simpler set $P^{(2)}$.
But this process can be carried on indefinitely. Indeed it can be iterated up
to any ordinal number from Cantor’s first or second number-classes (i.e.
any finite or countable ordinal), giving a countable sequence of analytic
expressions $F_\alpha(z)$.2

Thus Mittag-Leffler proved: 

(3) Let F’(z) be a single-valued analytic function whose singularities
form the set P. Then if $\alpha$ is a finite or countable ordinal, there is an
analytic expression F(z), and a single-valued analytic function G(z)
whose singularities all lie in $P^{(\alpha)}$, such that

$F'(z) = F(z) + G(z)$.3

The F(z) in this case will be constructed from the countably many $F_\beta(z)$
constructed at each stage of the process.

This result (3) combined with Cantor’s fundamental decomposition 
theorems showed that Weierstrass’s problem can be completely solved in 


a large class of cases. For, it follows from Cantor’s results using his new 
theory that 


(4) If P is a closed set (i.e. if $P^{(1)} \subseteq P$) then there is always a number $\alpha$
of the first or second number-class such that $P^{(\alpha)} = P^{(\alpha+1)}$.
(4) breaks down into the following two cases:

(5) If P is closed and countable then there is a number $\alpha$ from the first
or second number-classes such that $P^{(\alpha)} = P^{(\alpha+1)} = \emptyset$;


1 See Mittag-Leffler [1884a], p. 29. 2 See Mittag-Leffler [1884a], p. 57.
3 See ibid., pp. 57-72.
and


(6) If P is closed and uncountable, then there is a number $\alpha$ from the
first or second number-classes such that $P^{(\alpha)} = P^{(\alpha+1)} \neq \emptyset$, i.e.
such that $P^{(\alpha)}$ is perfect.


(3) and (5) combined completely solve the representation problem where 
only countable sets of singularities are involved, no matter how the limit 
points of this countable set are distributed. And (3) and (6) combined reduce 
the problem where uncountable sets of singularities are involved to that 
involving only perfect sets of singularities. 
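
How (3) and (5) combine can be traced on the simplest non-trivial countable closed set. The sketch below is illustrative only: the set P, and the labels $F_1$ and $G_2$ for the second-stage expression and remainder, are assumptions made for this example and are not taken from Mittag-Leffler's paper.

```latex
% Illustrative case only: P is an assumed example, and F_1, G_2 are
% hypothetical labels for the second-stage expression and remainder.
\begin{align*}
P &= \{0\} \cup \{\, 1/n : n \ge 1 \,\}, \qquad
  P^{(1)} = \{0\}, \qquad P^{(2)} = \emptyset , \\
Q &= P - P^{(1)} = \{\, 1/n : n \ge 1 \,\}, \qquad Q \cup Q^{(1)} = P , \\
F'(z) &= F(z) + G_1(z), \qquad
  \text{singularities of } G_1 \subseteq P^{(1)} = \{0\} , \\
G_1(z) &= F_1(z) + G_2(z), \qquad
  \text{singularities of } G_2 \subseteq P^{(2)} = \emptyset .
\end{align*}
```

Here the number $\alpha$ of (5) is 2: after two applications of the reduction step the remainder has no singularities in P left to account for, so the process terminates. For a countable closed set of more complicated structure the same reduction would simply be continued through further, possibly transfinite, stages.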


The development of Mittag-Leffler’s work is a beautiful example of 
one theory feeding off the results of another. And the novel consequences 
of Cantor’s 1880-3 theory in particular enabled Mittag-Leffler to take a 
substantial step towards a complete resolution of the representation 
problem. The ‘importance’ of this problem is not in any doubt; it was 
clearly not a problem created by Cantor’s new theory, and Weierstrass 
himself had marked it out as a fundamental problem for his programme 
in 1876. Lastly, Mittag-Leffler’s theorem played no part in the con- 
struction of Cantor’s theory; rather (5), the crucial contribution from 
Cantor’s theory, was a developed consequence of the new theory, not ‘built
into’ it.

Thus according to Hilbert’s criterion, Cantor’s introduction of trans- 
finite numbers into point-set theory clearly represents progress over 
Cantor’s pre-1880 point set theory. Thus, we have one example of how 
Hilbert’s criterion can be applied in appraising theoretical change.

I now want to turn to another application, at first sight rather a curious 
one, of Cantor’s theory of transfinite numbers, namely Borel’s 1895 proof 
of the Heine-Borel theorem. This is particularly interesting because it 
raises an important question about the use of problem solutions in apprais- 
ing mathematical theories. 


(c) Borel’s 1895 Proof of the Heine-Borel Theorem 
In his [1895], Borel uses the following lemma: 
... if an infinity of given partial intervals on a line is such that its total sum is


1 See Cantor [1884]. (5) follows from the fact that any closed P can be represented as

$$P = (P - P^{(1)}) \cup \bigcup_{1 \le \alpha < \Omega} (P^{(\alpha)} - P^{(\alpha+1)}) \cup P^{(\Omega)}$$

(where $\Omega$ is the first ordinal of the third number-class) and the fact, proved in Cantor
[1883b], that the second number-class is uncountable. Hence, if P is countable, there
must be a first ordinal $\beta < \Omega$ such that $P^{(\beta)} = \emptyset$, otherwise there would be uncountably
many non-empty, disjoint sets on the right-hand side, thus making P uncountable.
(6) follows from Cantor’s proof ([1884], pp. 465-6) that $P^{(\Omega)}$ and hence some $P^{(\alpha)}$
for $\alpha < \Omega$ must be perfect (ibid., p. 467).
less than that of a given total interval, then there is at least one point of the interval
not belonging to any of the partial intervals.1 


It is clear from the context that ‘infinity’, ‘partial interval’, and ‘total 
interval’ are to be understood as ‘countable infinity’, ‘open interval’ and
‘closed interval’ respectively. Thus stated contrapositively the lemma reads: 


(7) If a countably infinite collection of open intervals completely covers
a closed interval, then its total sum must be at least as great as the 
length of the closed interval. 


The problem in proving (7) is to assign some value to the open cover, or 
at least to put a lower bound on it. This is precisely what the Heine-Borel
theorem would do. Borel deals with this in a note at the end of his paper.
He claims that although the lemma is ‘almost obvious’, it is nevertheless
worth giving a proof based on a theorem ‘interesting in itself’. This he
states as: 

Given an infinity of partial intervals on a line segment such that each point of 
the line lies in the interior of at least one of the intervals, then one can effectively 
determine a limited number of intervals chosen among the given intervals and 
having the same property (every point of the line is interior to at least one of 
them).2

This formulation is not quite that of the Heine-Borel theorem. But what 
Borel actually sets out to prove is: 


(8) Any countable cover of a closed interval by open intervals can be 
reduced to a finite cover.


This is now just the Heine-Borel theorem for countable covers, and (7)
follows easily from it.3

The proof which Borel gives for (8) uses Cantor’s transfinite ordinals 
(actually only the countable ordinals) as follows. 

Let [a, b] be the interval in question. Let $(a_1, b_1)$ be an interval which
covers a, $(a_2, b_2)$ an interval which covers $b_1$, $(a_3, b_3)$ an interval
covering $b_2$, and so on. If b is reached after a finite number of steps, the
theorem is proved. If not, we eventually reach $(a_\omega, b_\omega)$ which may or
may not cover b. In any case b must eventually be covered by some
$(a_\alpha, b_\alpha)$ where $\alpha$ is a countable ordinal since otherwise an
uncountable number of intervals would be required to cover [a, b], which is
contrary to the statement of the theorem.
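
The finite stage of this construction can be made concrete. What follows is only an illustrative sketch in Python and is not found in Borel. It assumes a finite family of open intervals given as (left, right) pairs; the name `borel_chain` and the rule of always taking the covering interval that reaches furthest to the right are choices made here for definiteness (the construction above allows any interval covering the current point).

```python
def borel_chain(a, b, intervals):
    """Build a chain of open intervals covering [a, b].

    The first chosen interval contains a, each later one contains the
    right end-point of the interval chosen before it, and the process
    stops once the right end-point passes b.
    """
    chain = []
    point = a
    while True:
        # Open intervals (lo, hi) that contain the current point.
        candidates = [iv for iv in intervals if iv[0] < point < iv[1]]
        if not candidates:
            raise ValueError(f"no interval covers the point {point}")
        # Assumption of this sketch: take the one reaching furthest right.
        chosen = max(candidates, key=lambda iv: iv[1])
        chain.append(chosen)
        if chosen[1] > b:
            return chain
        point = chosen[1]


cover = [(-0.5, 0.3), (0.2, 0.6), (0.1, 0.4), (0.5, 0.9), (0.8, 1.2), (0.25, 0.35)]
chain = borel_chain(0.0, 1.0, cover)
total_length = sum(hi - lo for lo, hi in chain)
```

For a finite family the right end-points strictly increase, so the chain must stop after finitely many steps. The lengths of its members add up to more than b - a, which is the content of (7) in this simple case.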

1 The particular problem Borel was investigating is described in Hawkins [1970], pp. [illegible]-102. Although Hawkins mentions the Heine-Borel theorem, he doesn’t discuss Borel’s proof.
2 Borel [1895], p. 281.
3 (8) was christened the ‘Heine-Borel theorem’ by Schönflies in his [1899], p. 51.

Thus, [a, b] is covered by $\{(a_\xi, b_\xi) : \xi < \alpha+1\}$. Once this is done the
proof then shows, by induction on $\alpha$, that this collection of intervals can
be reduced to a finite collection. This is done in the following way.

Assume as hypothesis that $[a, b_\xi]$ can be finitely covered for all $\xi < \beta < \alpha$;
then it is shown that $[a, b_\beta]$ can also be finitely covered. If $\beta$ is a successor
ordinal the result is immediate. If on the other hand $\beta$ is a limit ordinal,
then since $\beta$ is countable there must, by definition, be a countable sequence
of increasing countable ordinals $\xi_1, \ldots, \xi_n, \ldots$ which has $\beta$ as a limit.
Hence, $b_{\xi_1}, b_{\xi_2}, \ldots, b_{\xi_n}, \ldots$ must be a sequence of points in $[a, b_\beta]$ tending
to $b_\beta$ from the left. Hence, since $(a_{\beta+1}, b_{\beta+1})$ includes $b_\beta$ in its interior, it
must also include all but a finite number of the $b_{\xi_n}$. Let $b_{\xi_{n_0}}$ be the sequence
point with least index in $(a_{\beta+1}, b_{\beta+1})$. By hypothesis $[a, b_{\xi_{n_0}}]$ can be finitely
covered, since $\xi_{n_0} < \beta$. Hence $[a, b_\beta]$ can be finitely covered simply by
taking the finite cover of $[a, b_{\xi_{n_0}}]$ and $(a_{\beta+1}, b_{\beta+1})$. This completes the proof.

It seems clear that this use of Cantor’s theory again demonstrates its 
merit according to the conditions of Hilbert’s criterion. But the interesting 
point here is the following. The Heine-Borel theorem was proved very 
soon after 1895 in point-set analysis alone without any use of Cantor’s 
theory of transfinite numbers. It might then be wondered whether Borel’s 
use of the transfinite numbers was not just something of an oddity, and 
thus whether it is correct to regard this use as an indication of the merit 
of Cantor’s theory. This raises a more general question concerning the use 
of problem solutions in appraising theories. If a theory T is used to solve 
problems in a theory T* and it is later shown that the problems can be 
solved in T* alone, can we regard the original solutions using T as an 
indication of the merit of T? 

It seems to me that we can and should. Any general theory T* has
potentially infinitely many consequences. But these consequences have to 
be discovered. Thus, if a theory T plays a part in the discovery of con- 
sequences of T* then this should surely be allowed to count in T’s favour 
even if T is subsequently shown to be strictly unnecessary for deriving 
the statement in question from T*. After all, in looking at the history of
mathematics we are not concerned with whether a theory is in the long 
run logically indispensable, but rather with the impact it had on the state 
of mathematical knowledge in the period after its proposal. Theories which 
are heuristically important may turn out to be logically dispensable. But 
the historian and methodologist of mathematics must be prepared to give 
a theory credit for the heuristic role it played.2

1 See Borel [1895], pp. 280-2, and Schönflies [1899], pp. 51-2. Borel actually gives very
few of the details and even Schönflies doesn’t give them all. For example, he only proves
the induction step from finite numbers to $\omega$, and not in complete generality.

2 As John Worrall pointed out to me the situation here is analogous to that regarding
physical theories. If a theory T predicts a novel fact e then according to MSRP this fact
supports T, and the support is not destroyed by the subsequent derivation of e from a

It is often not difficult to see just what the heuristic role played by the
discovering theory is. In many cases, for example, the theory T might
suggest how to look for a proof in T* alone. This is just what happened in
the case of the Heine-Borel theorem.

The standard proofs of the Heine—Borel theorem rely on the connection 
between either the Bolzano-Weierstrass property and compactness, or
sequential compactness and compactness. (Indeed it is now known that for 
metric-spaces these three properties are equivalent.) But it was the original 
transfinite numbers proof which first established that connection. What 
the original proof shows, in effect, is that the Bolzano—Weierstrass theorem 
employed a transfinite (but countable) number of times entails the Heine- 
Borel property. 

Take as an example the first stage of the proof. Define the intervals
$(a_n, b_n)$ as above (p. 21), and assume that $\{(a_n, b_n) : n = 1, \ldots, m\}$ fails to
cover [a, b] for each m. Then $b_1, b_2, \ldots, b_n, \ldots$ will be a sequence of points
of the space [a, b], and thus (by the Bolzano-Weierstrass property of
sequential compactness) will have a limit point p in [a, b]. This is one
application of the Bolzano-Weierstrass theorem. By the hypothesis of the
theorem there is an $(a_\omega, b_\omega)$ which includes p; and since p is the limit of
the $b_n$, $(a_\omega, b_\omega)$ will include all the $b_n$ from some $n_0$ on. From this it is
easy to show that $[a, b_\omega]$ can be finitely covered.

The construction of the set $\{(a_\xi, b_\xi) : \xi < \alpha+1\}$ (above, p. 21) and the
fact that it can be reduced to a finite sub-cover now follows from countably
many repetitions of the same argument, i.e., in particular, countably many
repetitions of the Bolzano-Weierstrass theorem.

Once a proof has established a connection between two statements or 
properties, one might then wish to proceed to investigate the connection, 
by seeking to discover, for example, whether the connection can be made 
more simply, or perhaps by a different route altogether. In the case of the 
Heine-Borel theorem, Borel and others (particularly Lebesgue) investigated 
the established connection between the Bolzano-Weierstrass property and
compactness. The two most ‘obvious’ paths to follow are first to investigate
whether the Heine-Borel property follows from fewer applications of the
Bolzano-Weierstrass theorem; or second, whether even the proof of this
theorem can be adapted to establish the Heine-Borel property. Borel took
the second path, and Lebesgue the first. Thus Borel in his [1898] succeeded
in showing that the standard proof of the Bolzano-Weierstrass theorem
can be adapted so as to prove the Heine-Borel theorem directly. His proof
runs as follows.

2 (continued) different theory T*. This applies even if T* is T’s predecessor, and e is also novel with
respect to T*. The deduction of e from T* will certainly show that T is logically redundant
as regards e. But the methodology nonetheless insists that the introduction of T
constitutes progress. Intuitively, this is correct since T has led to an increase in our
knowledge. Its logical dispensability is irrelevant.

Let [a, b] be given, and assume we are dealing with a cover of [a, b] by
open intervals which cannot be reduced to a finite subcover. (Borel still
assumes the cover is countable, but this is not necessary.) Divide [a, b]
into [a, (b+a)/2] and [(b+a)/2, b]. Clearly for at least one of these intervals
the cover cannot be reduced to a finite sub-cover. Denote this interval (and
if the same applies to both take the left hand one) by $A_1$. Now similarly
divide $A_1$ into two closed intervals, and so on. In this way we get a descending
sequence of closed intervals $A_1 \supseteq A_2 \supseteq A_3 \ldots \supseteq A_n \supseteq \ldots$ whose
diameters tend to zero. Up to this point the proof is just a repetition of the
standard Bolzano-Weierstrass argument. Continuing that argument, there
will be a point p lying inside all the $A_n$, which will be the limit of a convergent
sub-sequence of end-points of the $A_n$. Now lastly, let (c, d) be an
open interval from the cover of [a, b] which includes p in its interior.
Since the diameters of the $A_n$ decrease to zero there will be an $n_0$ such
that $A_m \subseteq (c, d)$ for all $m > n_0$. Thus, all these $A_m$ are finitely coverable
which gives us a contradiction. Hence, every cover is reducible to a finite
subcover.1

The proof is almost an exact replica of the Bolzano—Weierstrass proof. 
Lebesgue’s procedure was even simpler. In his [1904] he argued as follows. 

Let X be the set of all x in [a, b] such that [a, x] has a finite sub-covering.
Assume $b \notin X$. Then there is either a first point not in X, or a last point in
X. In either case let $x_0$ be this point. In fact $x_0$ will be the supremum of X.
Now there is an interval (c, d) from the cover which includes $x_0$, since
$a < x_0 < b$. Form $(c, x_0)$ and $(x_0, d)$ and let $x_1 \in (c, x_0)$ and $x_2 \in (x_0, d)$.
Then $a < x_1 < x_0 < x_2 < b$ and so $x_1 \in X$. Thus, there are finitely many
intervals covering $[a, x_1]$. Add (c, d) to these and we have finitely many
intervals covering $[a, x_2]$, which is a contradiction since $x_0 < x_2$. Thus,
$b \in X$, and [a, b] can be finitely covered.2

In this case the theorem is derived from just one application of the 
Bolzano-Weierstrass theorem or the equivalent sequential compactness 
property.3

1 See Borel [1898], pp. 43-4.

2 See Lebesgue [1904], pp. 104-5. Proofs very similar to this are given in Kelley [1955],
pp. 144-5, and Hocking and Young [1961], p. 18.

3 This is required to prove the existence of $x_0$, assuming that we are dealing with real
numbers given by Weierstrass’s or Cantor’s definitions. See Courant [1937], pp. 61-3.
If we use Dedekind’s theory, then the Dedekind cut theorem has to be used.

The heuristic importance of the original transfinite numbers proof is
now clear; their use in the proof of the Heine-Borel theorem is not merely
an oddity. Thus, while Borel for example might have been led to find a
new proof because of his philosophical objections to the transfinite numbers
and the actual infinite, it is certain that heuristically he was very much
indebted to them. Indeed Borel admitted as much himself in some general
remarks he made in his [1899]. Pointing out that he does not make a great
deal of direct use of Cantor’s theories in his paper he says:

I hope that this exposition, which is set out in such a way that Cantor’s name is
hardly mentioned, will not prevent anyone from realising how great is the
importance of his works and the influence that the reading of them has had on
the development of the ideas which I try to set out here.1

Then in a footnote he goes on: 

It is all the more important for me to say this since the fact that I did not point it
out in a recent book of mine, together with the fact that I did not use Cantor’s
methods there, has led certain readers to believe that I do not place a high value
on his work. The merit of the discoverer subsists even if, for one reason or another,
the method which he followed to achieve his aim is abandoned.2


(To be continued.) 


London School of Economics 


1 Borel [1899], p. 136.
2 Ibid., n. 1; my italics. Cf. Schönflies [1908], p. 76, n. 1.

Some Reflections on Quantum Logic and
Schrödinger’s Cat


by JEFFREY BUB 


The claim that logic is empirical1 (advanced as a solution to certain
conceptual problems of quantum mechanics) is a thesis about the way
the world is put together. The proposal that the logic of the world is 
non-Boolean concerns the actual possibility structure of events. The 
question of evidential support for this thesis is another matter. The issue 
here is methodological. On what grounds could it possibly be rational to 
maintain such a thesis? 

I want to consider two groups of experimental phenomena and show 
that together they involve a dilemma. A certain assumption, which is 
apparently required as a component of any explanation of the first group 
of phenomena, is excluded by the second experiment. The Copenhagen 
interpretation grasps one horn of the dilemma, the quantum logical 
interpretation the other. The Schrödinger cat paradox is ee 
of the Copenhagen interpretation. 

The first experiment concerns polarisation of light and can be per- 
formed by the reader. Take two sheets of polaroid, place them one over 
the other in front of a light source, for example an open window, and 
observe that the intensity of the light waxes and wanes, from a maximum 
to zero, as one polaroid is rotated relative to the other through an angle
of 90°.

This phenomenon is most easily explained by assuming that the system
passing through the polaroid sheets is a (classical) wave, whose state of
polarisation is induced by passage through a sheet of polaroid. It is a
perfectly natural conception to think of the polaroid as a barrier
transforming the oscillatory motion of the wave as it passes through. The
reason a wave theory is unsatisfactory is that the system manifests itself
as a particle when it hits a photographic plate, i.e., the energy of the wave
manifests a “quantum” behaviour: at sufficiently low intensities one
observes individual hits on the photographic plate which build up over
time to an intensity proportional to that of the wave. To see this, of
course, you would have to substitute a photographic plate for your eye,
and reduce the intensity of the light.

1 The slogan originates with Hilary Putnam’s provocative paper “Is Logic Empirical?”
Putnam [1968]. For recent debate on the issue, see Suppes [1976]. For reference to the
‘classical’ papers on quantum logic, see Hooker [1975].

As a particle explanation of the phenomenon, consider the following 
hypothesis, 

H: (i) A polaroid is a measuring instrument for a directional property
$P_\theta$ (‘polarisation in the direction $\theta$’) in the sense that it acts as a selection
device or filter for microsystems with this property. (ii) Each microsystem,
i.e., each photon, is characterised by a set of properties $P_\theta$, such
that for each property $P_\theta$, the photon either has the property $P_\theta$ or lacks
the property $P_\theta$. More precisely, at each moment of time each photon
is characterised by a state represented by a map which assigns the value 1
or 0 to each possible property, so that the photon has the property $P_\theta$ if
and only if it lacks the property $P_{\theta+\pi/2}$. This map is extended
homomorphically to properties such as ‘$P_\theta$ and $P_\phi$’, ‘$P_\theta$ or $P_\phi$’, ‘not-$P_\theta$’, etc.,
in the usual way, so that the set of properties assigned the value 1 is an
ultrafilter in the Boolean algebra of possible properties, generated by
conjunction, disjunction and negation on the properties $P_\theta$. (iii) The fact
that rotation of a single polaroid does not affect the intensity of the light
passing through it (a uniform reduction in intensity over all angles is
observed) may be explained by assuming an initial uniform distribution
over all properties $P_\theta$ in the photon ensemble issuing from the light
source. After passage through the first polaroid, with its ‘axis of
polarisation’ at an angle $\psi$, the ensemble of photons is such that every single
photon has the property $P_\psi$ and lacks the property $P_{\psi+\pi/2}$, whatever the
distribution over other properties corresponding to other values of $\theta$ may be.

Now, this hypothesis is immediately excluded by the following pheno- 
menon: place a third sheet of polaroid between the first two, when the 
first two are at the position corresponding to zero intensity (i.e., relative
angle $\pi/2$). As the middle polaroid is rotated, keeping the first two at the
initial position, the intensity changes from zero to maximum (although the 
maximum intensity with three polaroids is less than the maximum in- 
tensity with two polaroids). The important point is that the middle 
polaroid appears to select something from nothing, and this excludes the 
hypothesis under consideration. 

Placing the third polaroid in front of the original two when they are 
at the position corresponding to zero intensity, or behind them, does 
not alter the intensity. It is only in the middle position that the third 
polaroid appears to create photons out of nothing. 

Again, a classical wave theory explains these phenomena adequately, 
but since we are considering a particle explanation (for the reasons outlined
above), the following modification of H suggests itself:

H’: As in H, each photon is characterised by a set of directional properties
$P_\theta$. A polaroid with its axis of polarisation in the direction $\psi$ is a
selection device for the property $P_\psi$, but in selecting those photons with
the property $P_\psi$, the polaroid disturbs the systems in such a way that the
properties $P_\theta$ ($\theta \neq \psi$ or $\psi+\pi/2$) may be altered.

The hypothesis H’ involves the supposition that the polaroid is not
merely a measuring instrument which selects for a certain property, but
a device which disturbs the systems it passes. H’ preserves a feature of
H: at each moment of time the system is characterised by a 2-valued
map which assigns a 1 or a 0 to each property $P_\theta$ (for all $\theta$) in a consistent
way, i.e., such that the set of properties assigned the value 1 forms a
Boolean ultrafilter in the Boolean possibility structure of properties
generated by all the properties $P_\theta$. According to H’, the polaroid is a
measuring instrument in so far as it selects for a property of photons.
But in selecting or filtering those photons with a certain property, the
polaroid disturbs the systems it passes in an essential way, so that other
photon properties are altered. This hypothesis explains the phenomenon
(something rather than nothing, when a third polaroid sheet is placed at
certain angles between the first two) because, as a result of the disturbance,
the ensemble leaving the middle polaroid is no longer uniformly characterised
by the property $P_\psi$, and hence a certain fraction of the ensemble
will pass the second polariser, with axis of polarisation at $\psi+\pi/2$.

A careful measurement of the variations in intensity with change of 
angle (using two sheets of polaroid) will reveal that the resulting intensity 
is simply proportional to the initial intensity by a factor equal to the 
square of the cosine of the relative angle of polarisation. Further specifica- 
tion of the measurement disturbance can accommodate this fact. 
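
The cosine-squared law, applied once per sheet, already reproduces the two- and three-polaroid observations described above. The following is a minimal sketch in Python assuming ideal polaroids. The function name `transmitted`, and the convention that the input intensity is the intensity leaving the first sheet in the list, are assumptions of the sketch.

```python
import math


def transmitted(intensity, axes):
    """Intensity after a sequence of ideal polaroids.

    `intensity` is the intensity leaving the first polaroid in `axes`
    (angles in radians). Each later polaroid multiplies it by cos^2 of
    the angle between its axis and the axis of the polaroid before it.
    """
    for previous, current in zip(axes, axes[1:]):
        intensity *= math.cos(current - previous) ** 2
    return intensity


# Two crossed sheets: essentially nothing gets through.
crossed = transmitted(1.0, [0.0, math.pi / 2])
# A third sheet at 45 degrees between them: a quarter gets through.
with_middle = transmitted(1.0, [0.0, math.pi / 4, math.pi / 2])
# The same third sheet placed in front of the crossed pair: nothing again.
in_front = transmitted(1.0, [math.pi / 4, 0.0, math.pi / 2])
```

Only the middle position yields a non-zero intensity, which is the ‘something from nothing’ effect that any particle hypothesis has to accommodate.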

Suppose we characterise the photon after passage through the first
$\psi$-polaroid by a parameter $\psi$, representing the property selected, and
introduce a ‘hidden variable’ $\lambda \in \Lambda$ so that the pair $(\psi, \lambda)$ defines an
appropriate 2-valued map, $A_\theta(\psi, \lambda)$, on the properties $P_\theta$. This map will
have to satisfy the condition that $A_\psi(\psi, \lambda) = 1$, for all $\lambda$. Then the
measurement disturbance may be characterised as a disturbance of the system in
such a way that the ensemble of photons selected is uniformly distributed
over all values of $\lambda$, i.e., we assume that the polaroid disturbs the systems
it passes in a random way. Specifically, we may assume that a photon
ensemble selected and disturbed by a polaroid with axis in direction $\psi$
is represented by a probability measure $\sigma(\psi) \times \rho(\lambda)$ over the measure
space $\Psi \times \Lambda$, where $\sigma(\psi)$ is an atomic measure concentrated at the point $\psi$
and $\rho(\lambda)$ is uniform over $\Lambda$ (irrespective of the probability measure
representing the initial ensemble). It is easy to choose an appropriate
measure space and define a map $A_\theta(\psi, \lambda)$, so that a fraction $\cos^2(\psi - \theta)$ of
an ensemble passed by a $\psi$-polaroid has the property $P_\theta$ and hence will
pass a $\theta$-polaroid, in agreement with observation.
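
One possible choice of such a map can be written down explicitly. The sketch below (in Python) is an illustration only and is not claimed to be the map used in the literature cited: it assumes that the hidden variable ranges uniformly over the interval [0, 1), and it decides each directional property by a threshold on that variable. The names `has_property` and `fraction_with_property` are hypothetical.

```python
import math


def has_property(theta, psi, lam):
    """Hypothetical 2-valued map on directional properties.

    Returns 1 if a photon selected by a psi-polaroid, with hidden
    variable `lam` in [0, 1), has the property 'polarised along theta',
    and 0 otherwise.
    """
    d = (theta - psi) % math.pi  # directions are defined only modulo pi
    if d < math.pi / 2:
        return 1 if lam < math.cos(d) ** 2 else 0
    # Complement of the value for the perpendicular direction theta - pi/2,
    # using cos^2(d - pi/2) = sin^2(d).
    return 0 if lam < math.sin(d) ** 2 else 1


def fraction_with_property(theta, psi, samples=10_000):
    """Fraction of a uniformly distributed ensemble that has the property."""
    lams = [(k + 0.5) / samples for k in range(samples)]
    return sum(has_property(theta, psi, lam) for lam in lams) / samples
```

Every photon in the ensemble has the selected property (the value is 1 when theta equals psi), each photon has exactly one of any two perpendicular properties, and the fraction with the property along theta comes out as the squared cosine of the relative angle.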

H’ represents the ‘hidden variable’ interpretation of quantum mechanics. 
Historically, the conceptual problems of micro-physics were introduced 
by the dual ‘wave-particle’ nature of microsystems such as photons and 
electrons. The proposal to extend the quantum mechanical description 
of a microsystem by additional hidden variables is, in effect, to regard 
the microsystem as a particle whose apparent wave-like properties (as 
exhibited, for example, by the phenomenon of polarisation) reflect the 
disturbance by the measuring instrument (in this case, the polaroid). 

To sum up: the first experiment (the phenomenon of polarisation)
excludes hypothesis H, in which each system has all its properties and the
measuring instruments merely select for specific properties without
disturbance. When I say ‘each system has all its properties’, I mean that
for each property corresponding to a measuring instrument (and there is
a measuring instrument for each angle $\theta$), the system either has the
property or it does not, in the Boolean sense outlined above. The only
empirically adequate hypothesis involves the supposition that the measur- 
ing instruments do not merely select but also disturb. 

The second experiment, which together with the first leads to a dilemma, 
is the Einstein-Podolsky-Rosen correlation experiment. I want to show
that the possible correlations between isolated systems exclude the
disturbance hypothesis, H’.

Briefly, the situation is this: it is possible to have two systems of the sort 
considered in the first experiment which are created together in an initial 
interaction and subsequently move apart. As a result of the interaction, 
the two systems are correlated in the following way: the first system, S,
appears to be a ‘mirror-image’ of the system S’, i.e., S passes a $\psi$-filter if
and only if S’ is blocked by a $\psi$-filter, for any angle $\psi$. That is, apparently
S has the property $P_\psi$ if and only if S’ lacks the property $P_\psi$, and this
holds for every property $P_\psi$, since we may choose to test for any property,
and the correlation is maintained. Moreover, this correlation only exists
for the first measurement at S or S’. It is not the case that S passes two
filters, a $\psi$-filter and a $\phi$-filter, if and only if S’ does not pass through
similar filters. In fact, the observed intensities are as if the filters disturb 
the systems they pass. 

The correlations at the first stage of the experiment (when only the
correlations between a single spin measurement at S and a corresponding
single spin measurement at S’ are considered) suggest a hypothesis
like H, i.e., the system S is characterised by a set of properties, such that
for each property $P_\theta$, either $P_\theta$ belongs to the set or, if not, the property
$P_{\theta+\pi/2}$ belongs to the set, and S’ is characterised by precisely those
properties which do not belong to S. At first sight, H’ seems to explain both
the polarisation experiment and the correlations of the Einstein-Podolsky-Rosen
experiment. The correlations at the first stage of the Einstein-Podolsky-Rosen
experiment are explained by assuming that S and S’
have the correlated properties before measurement. The disturbance
assumption which explains the polarisation phenomena also explains why
the correlations disappear after the first measurement at S (or S’).

1 For details, see Bub [1976], pp. 517, 518.

The success of H’ is only apparent, however, as the more detailed
examination of the statistical correlations by J. S. Bell reveals.1 The
probability of S passing a $\psi$-filter is ½, for any $\psi$. Similarly, for S’. The
probability of S passing a $\psi$-filter and S’ failing to pass a $\phi$-filter is equal
to half the probability that S, having passed a $\phi$-filter, will pass a $\psi$-filter.
It can be shown that the hypothesis H’, together with the assumption of
mirror-image correlations over all the properties for the two systems S
and S’, implies the following: for any measure on the measure space of
the composite system S+S’, the measure of the set of points for which S
has the property $P_\psi$ and S’ fails to have the property $P_\phi$, is equal to the
measure of the set of points for which S has the properties $P_\psi$ and $P_\phi$.2
In other words, on the assumption H’, the probability that S, having
passed a $\phi$-filter, will pass a $\psi$-filter, is equal to the measure of the set
of points for which S has both the properties $P_\psi$ and $P_\phi$, divided by the
measure of the set of points for which S has the property $P_\phi$ (i.e., ½, for the
initial measure appropriate to the experiment we are considering here).
This means that the probability in question is equal to the classical (i.e.,
Boolean) conditional probability, and this is characteristic of a filtering
process that merely selects for the properties $P_\psi$ and $P_\phi$ without any
disturbance. But the hypothesis H’ requires that passage through a filter, in
addition to selecting for a particular property, disturbs the system in such
a way that an initial measure over the measure space $\Psi \times \Lambda$ is transformed
to the measure $\sigma(\psi) \times \rho(\lambda)$ (with $\sigma(\psi)$ concentrated at the point $\psi \in \Psi$ and
$\rho(\lambda)$ uniform over $\Lambda$), which represents a classical conditionalisation and
randomization of the initial measure.
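
The conflict can also be checked numerically. The sketch below (in Python) is neither Bell’s derivation nor the reconstruction cited above. It assumes only (a) the consequence just stated, that the measure of the set of points where S has both of two directional properties must equal half the squared cosine of the angle between them, and (b) the elementary fact that any Boolean measure m satisfies m(a and b) + m(b and c) <= m(a and c) + m(b). The function names are hypothetical.

```python
import math


def joint(x, y):
    """Measure that the observed statistics would force on 'S has both
    the property along x and the property along y': (1/2) cos^2(x - y)."""
    return 0.5 * math.cos(x - y) ** 2


def boolean_bound_holds(a, b, c):
    """Check m(a and b) + m(b and c) <= m(a and c) + m(b) with m(b) = 1/2.

    The inequality holds for every measure on a Boolean algebra, so a
    failure means that no such measure can yield the values of `joint`.
    """
    return joint(a, b) + joint(b, c) <= joint(a, c) + 0.5 + 1e-12


violated = not boolean_bound_holds(0.0, math.pi / 6, math.pi / 3)
satisfied = boolean_bound_holds(0.0, math.pi / 2, math.pi / 4)
```

For the directions 0, 30 and 60 degrees the left-hand side is 0.75 and the right-hand side 0.625, so the bound fails: a single Boolean measure over all the directional properties cannot reproduce the cosine-squared statistics for these three directions.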

Evidently, the assumption (call it P) that each system has all its
properties in the Boolean sense clarified above, is excluded by the polarisation
experiment, unless modified by the disturbance assumption (call it D),
which is excluded by the correlations of the Einstein-Podolsky-Rosen
experiment. If we attempt to develop a theory of micro-physics which
incorporates the principle P, we have a problem in explaining the
phenomenon of polarisation, a typically wave effect, i.e., a phenomenon
associated with a system that is not localised. If we attempt to develop a
theory incorporating a disturbance principle D, which seems to be
required in order to explain the polarisation experiment, we have a problem
in explaining the Einstein-Podolsky-Rosen correlations, which at least
partly suggest the existence of localised systems S and S’ with correlations
existing prior to measurement between the actual properties of the two
systems. This is the dilemma.

1 See Bell [1964].
2 See Bub [1976]. My reconstruction.

The Copenhagen interpretation attempts to maintain the disturbance 
principle D without P. I take the core feature of this position to be that 
a microsystem can only be said to have the property induced by the 
measuring instrument (and to lack the orthocomplement of this property).1 
Without specifying a measurement context, the attribution of properties
to the microsystem is regarded as meaningless. For example, the
Copenhagen interpretation associates with each photon a single directional
property $P_\psi$ induced by the polaroid. A polaroid with its axis of
polarisation in the direction $\psi$ is not a selection device for photons with the
property $P_\psi$, i.e., it is not a measuring instrument for this property in the
usual sense. Rather, the polaroid is regarded as capable of altering the
incident photons in such a way that an initial uniform ensemble of photons
with the property $P_\theta$ is transformed to a uniform ensemble of photons
with the property $P_\psi$. Thus, each micro-system is characterised by a
(Boolean) ultrafilter of properties generated from a single property $P_\psi$ by
conjunction, disjunction and negation. But there is an uncountable
infinity of properties $P_\theta$, $\theta \neq \psi$ or $\psi+\pi/2$, such that the system neither
has the property $P_\theta$ nor lacks the property $P_\theta$ (since it does not have the
property $P_{\theta+\pi/2}$). In this case, it seems preferable to regard the $P_\theta$ as a set
of possible states for the photon (‘polarisation states’). The polaroid is
then not a measuring instrument which selects for a certain property of
photons, but rather a device which induces a specific change in the
polarisation state of any photon it passes. Specifically, we may assume
that a polaroid with its axis of polarisation in the direction $\psi$ induces a
change of state on incident photons to $P_\psi$, in such a way that the
probability of a photon in polarisation state $P_\theta$ passing a $\psi$-polaroid is
$\cos^2(\psi - \theta)$. Putting this another way: we assume that the polarisation states
are propensity states, representing the propensity for a photon to pass a 
polaroid with its axis of polarisation in a certain direction. The propensity
for a photon in polarisation state $P_\theta$ to pass a polaroid with its axis in
direction $\psi$ is $\cos^2(\psi - \theta)$. This may be regarded as the propensity for the
polaroid to induce a state transition on the incident photon from $P_\theta$ to
$P_\psi$, since after passing a $\psi$-polaroid, a $P_\theta$-photon is transformed to a
$P_\psi$-photon, irrespective of the initial value of $\theta$.

1 I do not attempt to distinguish here between various versions of what might better be
termed the ‘orthodox’ interpretation, i.e., the view originating with Bohr and Heisenberg.
A close reading of Bohr will reveal a rather different thesis, but this is not to the point here.

Physicists are fond of this apparent solution to the problem. But the
notion of a ‘propensity state’ surely requires clarification. A propensity
is not merely a disposition to behave in a certain way under certain
circumstances. The polarisation state $P_\theta$ may well be understood as a dispositional
property, representing the disposition for the photon to pass a polaroid
with its axis in direction $\theta$ or $\theta+\pi/2$. A photon in the state $P_\theta$ has already
made up its mind, as it were, whether or not it will pass such a polaroid,
should it be asked to do so. For other angles however, the state $P_\theta$
determines only a probability to pass the polaroid, i.e., the photon cannot be
characterised as being disposed to pass a polaroid with its axis in direction
$\phi$: it may well fail to pass such a polaroid. One way of clarifying the
notion of a propensity state would be to analyse a propensity as a
probability distribution over dispositional properties. On this analysis, the
Copenhagen interpretation would essentially reduce to the hidden variable
hypothesis, H’.1

The Copenhagen interpretation originates with the attempt to have a 
particle which behaves like a wave. Just as a polarised wave is characterised 
by a single polarisation state, the micro-system is assigned a single property
$P_\psi$ out of the set of possible properties $P_\theta$ (for all $\theta$). Just as the polaroid
induces a characteristic change of polarisation and reduced intensity on 
a wave passing through it, so the polaroid induces a change of property on 
a certain fraction of an ensemble of microsystems passing through it 
(the rest being absorbed). While the function of the polaroid is entirely 
clear on a pure wave theory, it is unexplained on the Copenhagen inter- 
pretation unless something more is said. In the Einstein-Podolsky- 
Rosen experiment, the fact that any alteration in the experimental set-up 
at S automatically alters the probabilities of events at S’, is explained 
on the basis that S, S’, and the experimental context at S and S’, are to 
be treated as an indissoluble whole. It is not that properties of micro- 
systems are disturbed by the action of measuring instruments or properties 
created in measurement. Rather, the very attribution of properties to 
microsystems is to be regarded as an abstraction. For a microsystem, to be 
is to be disturbed in a certain context. 

In this sense, the Copenhagen interpretation might be said to be ‘soft
on properties’. The photon, recall, is regarded as only possessing the
single property $P_\psi$ induced by the polaroid (and only lacking the
orthogonal property $P_{\psi+\pi/2}$). Now, it is all very well to be soft on properties
as far as microsystems are concerned (they are so very tiny, after all)
but it is a little difficult to be soft on properties for macrosystems. The
property of being alive for a macrosystem such as a cat, for example, is
surely not induced by the measuring instruments with which we determine
whether or not the cat is alive or dead (in this case, our eyes will do).
Now, alternative properties $P_\psi$, not-$P_\psi$ of a microsystem which is not in
the state $P_\psi$ or $P_{\psi+\pi/2}$ may be coupled to the macroscopically
distinguishable alternative of life or death for a cat, as Schrödinger pointed out. It
follows that you cannot be soft on micro-properties without also being
soft on macro-properties. And this seems to be unacceptable.

1 Such an analysis is provided by the Bohm-Bub hidden variable theory. See Bohm and
Bub [1966] and Bub [1968].

Schrödinger¹ considers a cat in a closed box containing a small amount
of radio-active material. The probability that at least one atom will decay
within an hour is ½. If a decay occurs, a Geiger counter is activated,
closing a circuit which electrocutes the cat. Instead, we might consider
shooting a photon in the polarisation state $P_\theta$, where $\cos^2(\theta-\phi) = \tfrac{1}{2}$,
through a hole in the box towards a polaroid with its axis of polarisation
in the direction $\phi$. If the photon passes the polaroid, it is detected by a
device which kills the cat. Otherwise, the photon is absorbed by the
polaroid and the cat is reprieved.

Quantum mechanics assigns the composite system (photon + photon
detector and cat) a state represented by a vector in the Hilbert space of the
composite system, which is a linear superposition of two product states,
with coefficients $1/\sqrt{2}$, corresponding respectively to the polarisation
state $P_\phi$ for the photon and the cat dead, and the polarisation state
$P_{\phi+\pi/2}$ and the cat alive. According to the Copenhagen interpretation, this means
that the composite system possesses the atomic property² $P$ assigned
probability 1 by the composite system state, and lacks the orthocomplement
of this property, $P^\perp$. But neither the property $P$ nor the property
$P^\perp$ corresponds to either of the properties of the composite system,
($P_\phi$, dead) or ($P_{\phi+\pi/2}$, alive). This follows because the property ($P_\phi$, dead)
corresponds to the first component of the superposition, and the property
($P_{\phi+\pi/2}$, alive) to the second component of the superposition. Each of
these composite properties has probability ½, but since they are both
incompatible³ with the property $P$ possessed by the composite system
(as well as with the property $P^\perp$), the system neither has the property
($P_\phi$, dead) nor lacks the property ($P_\phi$, dead). Similarly, the composite
system neither possesses the property ($P_{\phi+\pi/2}$, alive), nor lacks this
property. Thus, the cat cannot be said to be either alive or dead. In fact,
since the state of the composite system is regarded as a ‘propensity state’,
the probabilities of ½ for the composite properties ($P_\phi$, dead), ($P_{\phi+\pi/2}$,
alive) are to be understood as representing the propensities for a suitable
measurement to induce a state transition from the composite state to one
of the components of the superposition representing these composite
properties. In this case, a ‘suitable measurement’ is just the action of
opening the box and looking at the cat.

1 See Schrödinger [1935].

2 $P$ is the property represented, in the Hilbert space of the composite system, by the
projection operator onto the 1-dimensional subspace spanned by the state vector of
the system.

3 In the technical sense of quantum mechanics. In the polarisation example, the
properties $P_\theta$, with $\theta \neq \phi$ or $\phi+\pi/2$, are all incompatible with the property $P_\phi$, as well as
with the property $P_{\phi+\pi/2}$.
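
The probabilities and the incompatibility claimed in the cat example can be checked in a toy calculation. The sketch below is an illustration only, under assumptions that are not in the text: it restricts attention to the two-dimensional subspace spanned by the two product states, represents properties by 2×2 real projection matrices, and treats "incompatible" as "the projectors do not commute". The helper names (`matmul`, `projector`, `prob`, `commute`) are hypothetical.

```python
import math

# Assumption: basis vector 0 stands for (P_phi, dead) and basis vector 1
# for (P_phi+pi/2, alive); everything else in the Hilbert space is ignored.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def projector(v):
    """Projection operator onto the line spanned by the unit vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def prob(proj, state):
    """Probability <state|proj|state> assigned to a property by a pure state."""
    return sum(state[i] * proj[i][j] * state[j]
               for i in range(2) for j in range(2))

def commute(a, b, tol=1e-12):
    """True if the two matrices commute (up to rounding)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return all(abs(ab[i][j] - ba[i][j]) <= tol
               for i in range(2) for j in range(2))

c = 1 / math.sqrt(2)
psi = [c, c]                        # superposition with coefficients 1/sqrt(2)
P_state = projector(psi)            # the atomic property P of the composite
P_dead = projector([1.0, 0.0])      # (P_phi, dead)
P_alive = projector([0.0, 1.0])     # (P_phi+pi/2, alive)

prob_dead = prob(P_dead, psi)
prob_alive = prob(P_alive, psi)
compatible_with_dead = commute(P_state, P_dead)
compatible_with_alive = commute(P_state, P_alive)

# Photon prepared at 45 degrees to the polaroid axis: cos^2(theta - phi).
pass_probability = math.cos(math.pi / 4) ** 2

print(prob_dead, prob_alive, compatible_with_dead, pass_probability)
```

Each composite property receives probability one half, yet neither projector commutes with the projector onto the superposition, which is the formal content of saying that the system neither has nor lacks those properties on the Copenhagen reading.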

It seems to me that this problem or, alternatively formulated, the
measurement problem, is symptomatic of the Copenhagen interpretation.
The measurement problem is usually formulated as the problem of
explaining the transition from a pure state to a mixture (or one of its
components) in a measurement.¹ But the presumption that the state of
the composite system (object + measuring instrument) is represented by
a mixture of product states, each product representing the object in a
certain state and the measuring instrument in a correlated state, follows
from the requirement that a macroscopic measuring instrument always
exhibits a state corresponding to a definite ‘pointer reading’, and never
a superposition of such states (i.e., a state in which it is neither true nor
false that the pointer is at position n, for any value n of the possible
pointer readings). Schrödinger’s cat merely dramatizes the problem. If
a macrosystem is treated as in principle analysable as a large number of
interacting microsystems, then the measurement problem is unavoidable
on the Copenhagen interpretation. I do not see how this problem can be
resolved within the framework of the Copenhagen interpretation, without
adding to quantum mechanics a principle preventing the superposition
of macroscopically distinguishable states of a macrosystem.

1 A pure state is represented by a vector in the Hilbert space of the system. A mixture
corresponds to a probability distribution (in the usual sense) over pure states, and
cannot be identified with any single pure state unless the weights of the components
in the mixture are all zero save one.

The quantum logical interpretation attempts to maintain the general
principle P without D by proposing a non-Boolean possibility structure
for the properties of microsystems. It is possible to show that the apparent
wave properties of a microsystem derive from specific features of the
non-Boolean possibility structure. A generalised theory of conditional
probability appropriate to the non-Boolean structure of events yields a modified
form of von Neumann’s projection postulate, first proposed by Lüders (and
generally accepted as correct), as the non-Boolean rule for conditionalising
a probability measure on new information, without disturbance.¹ The
Lüders rule, and not von Neumann’s rule, reduces to the classical
conditionalisation rule in the Boolean case (or when the algebra of random
variables is commutative). Thus, the projection postulate, which was
initially proposed by von Neumann as representing the disturbance of
the system by the measuring instrument, appears on the quantum logical
interpretation as a probability conditionalisation rule. The correct
conditional probabilities for sequential measurements in the polarisation
experiment, as well as the Einstein-Podolsky-Rosen correlations, are
derived from a probability analysis on the non-Boolean structure, using
the Lüders rule to conditionalise probabilities, without a disturbance
assumption.
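
As a rough illustration of conditionalising with the Lüders rule in the polarisation experiment, the sketch below uses the standard form of the rule, in which a statistical state ρ is replaced by PρP / tr(ρP) when the property with projector P is found to hold. The 2×2 real-matrix representation of polarisation properties, the chosen angles, and the function names (`polaroid`, `luders`) are assumptions made for the example, not part of the text.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def polaroid(angle):
    """Projector for linear polarisation along the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * c, c * s], [c * s, s * s]]

def luders(rho, proj):
    """Conditionalise the state rho on the property proj: P rho P / tr(rho P)."""
    weight = trace(matmul(rho, proj))
    if weight == 0:
        raise ValueError("cannot conditionalise on a probability-zero property")
    prp = matmul(matmul(proj, rho), proj)
    return [[x / weight for x in row] for row in prp]

rho0 = polaroid(math.pi / 4)     # photons prepared at 45 degrees (assumed)
first = polaroid(0.0)            # first polaroid
second = polaroid(math.pi / 6)   # second polaroid at 30 degrees to the first

rho1 = luders(rho0, first)                     # state conditional on passing the first
p_second_given_first = trace(matmul(rho1, second))

print(p_second_given_first)      # close to cos^2(30 degrees) = 0.75
```

The conditional probability comes out as the squared cosine of the angle between the two polaroids, obtained here purely by conditionalisation and without any separate disturbance term.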

The difficulty with this view does not seem to lie with the non-Boolean 
probability theory, but with the semantic features of the non-Boolean 
logic. The Kochen and Specker theorem shows that the non-Boolean 
possibility structure of a quantum mechanical system is not in general 
imbeddable into a Boolean algebra, from which it follows that there are
no 2-valued homomorphisms on the structure.² In the Boolean case, a
2-valued homomorphism on the algebra is a bivalent assignment of truth
values to all propositions in the usual way, i.e., the 2-valued
homomorphisms represent ‘logically strongest consistent propositions’ or ‘atomic
facts’, states of affairs in the maximal sense.

The non-existence of 2-valued homomorphisms in the general case 
means that every assignment of values to the magnitudes (the value a to 
the magnitude A, b to B, c to C, etc.) must violate the algebraic structure, 
in the sense that there will exist a pair of magnitudes, M and N (actually
an infinite set of such pairs), such that M = f(N), but m ≠ f(n). If we
demand preservation of structure in this sense, then there is no assignment 
of values to the magnitudes. 

It seems, therefore, that even on the assumption of a non-Boolean 
possibility structure for a quantum mechanical system, maintaining the 
following view faces formidable difficulties: 

(a) For any system S, at every time t, every magnitude has a value.

(b) The values of the magnitudes of S preserve the characteristic
algebraic structure of the magnitudes of a quantum mechanical system,
i.e., if the value of M for S at time t is m, and the value of N for S at time
t is n, and M = f(N), then m = f(n).

(c) Ideal measurements merely reveal the values of these magnitudes,
m of M, n of N, etc., without disturbing S; i.e., the measurement of M,
say, with result m on a statistical ensemble of systems is understood as the
selection of those systems with the value m of M from the ensemble,
without altering the value of M or the values of any other magnitudes.

(d) The statistical states of quantum mechanics represent all possible
probability distributions over the values of the magnitudes for S, and
transformations of probability distributions on measurement are derived
by conditionalising the initial probability distribution in accordance
with the measurement result, where a measurement is regarded as in
principle ideal (i.e., no additional disturbance transformation is invoked).

1 See Bub [1977], Lüders [1957]. The Lüders rule is discussed at some length in Furry
[1966].

2 See Kochen and Specker [1967]. This also follows as a corollary to Gleason’s Theorem.
See Gleason [1957].

The Schrödinger cat paradox is characteristic of any interpretation
which says that a system S at time t only has some of its properties (in the
sense that there exist properties P, such that neither P nor not-P belongs
to S at time t). It seems that the paradox can only be resolved if a
microsystem S, at any time t, is regarded as having all its properties (in the
sense that for every property P, either P or not-P belongs to S at time t).
But the assumption of a non-Boolean possibility structure for S does not
immediately show that (a) can be maintained together with (b), (c), and
(d), unless the sense of a magnitude ‘having a value’, or a proposition
‘being true’, or a property ‘obtaining’, as well as the notion of ‘disturbance’
(or ‘change of property’, in general) is understood in some radically new
way. If we drop requirement (a), then the quantum logical interpretation
incorporates the essential feature of the Copenhagen interpretation, and
cannot avoid the measurement problem. On this view (which seems to be
held by Simon Kochen),¹ the system S at any time t possesses an
ultrafilter of properties in the non-Boolean possibility structure, i.e., the set
of properties assigned probability 1 by the pure state or atomic property
characterising S at time t. The Copenhagen interpretation perhaps
restricts this set to an ultrafilter in a Boolean subalgebra of the
non-Boolean logic, in which case all the properties characterising S at time t
are mutually compatible. A Kochen ultrafilter contains mutually
incompatible properties as well (i.e., elements in the non-Boolean logic
which do not belong to a common Boolean subalgebra). The point of
similarity is that in both cases there are properties P such that both P and
the orthocomplement of P lie outside the ultrafilter, and it is this feature
that generates the measurement problem. Requirements (b), (c) and (d)
cannot be dropped without reducing quantum logic to the status of an
algebra of measurement operations, which must then be regarded as
disturbing the actual values of the magnitudes.


1 Lecture by Kochen and subsequent discussions at a Symposium on quantum mechanics
held at the Minnesota Center for Philosophy of Science, June 1976.

Thus, the programme for the quantum logical interpretation must be to
maintain (a), (b), (c) and (d) by a non-Boolean analysis of the semantic
notion of a property ‘obtaining’. The paradox is resolved if every property
either obtains or does not obtain, and if P obtains, then not-P does not
obtain. So the cat is always either alive or dead (exclusively), even when
the state of the composite system is represented by a superposition
assigning probability 1 to a property of the composite system incompatible
with either of the composite properties ($P_\phi$, dead), ($P_{\phi+\pi/2}$, alive).

I have argued that it makes good methodological sense to consider a
non-Boolean possibility structure for a quantum mechanical system, if
(a), (b), (c) and (d) cannot all be preserved on the assumption that the
possibility structure is Boolean, given that the only viable way of violating
(a), (b), (c) and (d), while preserving Booleanity, yields the Copenhagen
interpretation and the associated measurement problem. The open
problem for the quantum logical interpretation (the other horn of the dilemma)
is a non-classical theory of properties, i.e., a theory of the ‘obtaining’
relation which does not require the existence of a 2-valued homomorphism
assigning 1 to every property which obtains. Such a theory might be
expected to yield a theory of sets without points, in a sense analogous
to von Neumann’s generalisation of projective geometry to continuous
geometry, which he conceived as a geometry without points.¹


The University of Western Ontario


Scientific Explanation


by JAMES WOODWARD 


In many philosophical discussions of scientific explanation examples of 
the following sort are taken as typical or paradigmatic.¹


(Ex. 1) All ravens are black. 
This is a raven. 
This is black. 


(Ex. 2) All iron expands when heated. 
This is iron which has been heated. 
This expands. 


(Ex. 3) All water boils when heated. 
This is water which has been heated.
This boils. 


Many people have been struck by the apparent differences between 
derivations like Ex. 1—Ex. 3 and the sorts of explanations one might most 
naturally regard as paradigms of good scientific explanation, explanations 
of the sort one finds in scientific treatises and textbooks. In this essay I 
shall attempt to isolate one such difference and to argue that it is of 
fundamental importance. I shall argue that there is a fundamental differ- 
ence in the kind of understanding provided by derivations like Ex. 1-3 
and the kind of understanding provided by a good scientific explanation, 
and that it is a defect in the standard, Hempelian version of the covering- 
law model that it is insensitive to this difference. I shall contend that this 
difference is sufficiently important to warrant one in saying that, in an 
important sense, derivations like Ex. 1-Ex. 3 are not scientific explanations 
at all, and that an acceptable scientific explanation must meet another 
necessary condition in addition to those standardly imposed by covering-law
theorists.


Received 29 November 1977


1 The second and third examples are taken, with slight modifications, from Carnap
[1967] and Bergmann [1957] respectively.

I shall begin my discussion by describing in some detail several
examples of explanations one might naturally describe as scientific. I shall
then explore some of the similarities and differences between these
explanations and explanations like (Ex. 1-3).

Consider, first, an explanation (Ex. 4) of Galileo’s law in terms of
Newton’s laws of motion and the law of gravitation. If we assume that
the earth is a sphere and that the only force on a falling body is due to the
earth’s gravitational attraction, we have from the above laws,

$$F = ma = G\,\frac{Mm}{(R+h)^2}$$

where F is the force on the falling body, m its mass, h its height above the
surface of the earth, M the mass of the earth, and R the radius of the
earth. Since $R \gg h$, we can take $(R+h) = R$. Then dividing through by
m, we get

$$a = G\,\frac{M}{R^2}$$

When we substitute numerical values for G, M and R, we obtain g, the 
actual acceleration of an object falling freely above the earth’s surface. 
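
The substitution step can be carried out numerically. The sketch below is a minimal illustration, assuming standard reference values for G, M and R (these figures are not given in the text) and a hypothetical helper `acceleration`; it also checks how little the approximation (R+h) = R matters for a fall from 100 metres.

```python
# Assumed reference values in SI units; they are not taken from the text.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the earth, kg
R_EARTH = 6.371e6    # mean radius of the earth, m

def acceleration(mass, radius, height=0.0):
    """Acceleration a = G*M/(R+h)^2 of a body at height h above the surface."""
    return G * mass / (radius + height) ** 2

g_surface = acceleration(M_EARTH, R_EARTH)                # uses (R + h) = R
g_at_100m = acceleration(M_EARTH, R_EARTH, height=100.0)  # keeps h explicitly

# Relative size of the error introduced by neglecting h.
relative_error = (g_surface - g_at_100m) / g_at_100m

print(round(g_surface, 2), relative_error)
```

With these values the result is close to the familiar 9.8 m/s², and dropping h changes it by only a few parts in a hundred thousand.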

The second explanation (Ex. 5) I want to consider is the standard 
explanation given in micro-economic theory for why a monopoly which 
takes over a formerly competitive industry will raise prices and restrict 
output. 


[Figure 1. Price plotted against quantity, showing the curves AR, MR and SMC.]


Consider the above diagram. When a monopoly takes over a formerly 
competitive industry, the demand curve of that industry becomes the 
demand curve, or the average revenue curve (that is the curve which 
gives price per unit at each level of output) for the monopolistic firm. 
This curve, which is labelled AR in the diagram above, will be downward
sloping: price will be inversely related to quantity of product sold. The
curve labelled MR in the above diagram is the marginal revenue curve 
(that is, the curve which gives the change in total revenue occurring with 
each change in output) for the monopoly. Since the average revenue curve 
is downward sloping this curve will be more sharply downward sloping. 
The curve labelled SMC is the short-run marginal cost curve for the 
firm, that is the curve which indicates total change in cost for the monopoly 
per change in output. 

Now it is easy to see that if the monopoly maximises profits it will
select that price (P) at which marginal revenue is equal to marginal cost.
Suppose that the firm had selected price $P_1$, at which marginal revenue
exceeds marginal cost. In that case, if the firm were to lower its price
so that the quantity of goods demanded would rise, revenue would
increase at a faster rate than costs, so that profits would rise. Thus the
profit maximising firm will lower its price toward P. Suppose on the
other hand that the firm had selected price $P_2$. At this price costs are
increasing more quickly than revenue, and profits may be increased by
increasing price until P is reached.

Now contrast the behaviour of the monopoly with the behaviour of a
price taking firm in a competitive industry, before it is taken over by the
monopoly. Any such firm will sell goods at $P_2$, at which price is equal to
marginal cost. Such a firm by definition can sell any amount of its output
at the going market price. Its average and marginal revenue curves are
identical, and it cannot increase profits by restricting the quantity of
goods it sells.

Because marginal revenue is always less than average revenue for the
monopolistic firm, marginal cost will always exceed marginal revenue
at the price $P_2$ adopted by the competitive firm. Thus the monopoly
will always be, in comparison with the price taking firm, a price raiser
and output restrictor: it will raise prices from $P_2$ to $P$ and restrict quantity
of goods sold from $X_2$ to $X$.
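
The comparison can be made concrete with straight-line curves. The sketch below is an illustration under assumptions that go beyond the text: a linear average revenue curve P = a − bQ (so that marginal revenue is a − 2bQ), a linear marginal cost curve c + dQ, arbitrary numerical coefficients, and a hypothetical function name `monopoly_vs_competitive`.

```python
def monopoly_vs_competitive(a, b, c, d):
    """Compare outcomes under assumed linear curves.

    AR (demand):  P  = a - b*Q
    MR:           MR = a - 2*b*Q
    SMC:          MC = c + d*Q
    """
    # Monopoly: marginal revenue equals marginal cost.
    q_monopoly = (a - c) / (2 * b + d)
    # Competitive industry: price equals marginal cost.
    q_competitive = (a - c) / (b + d)
    price = lambda q: a - b * q
    return {
        "q_monopoly": q_monopoly,
        "p_monopoly": price(q_monopoly),
        "q_competitive": q_competitive,
        "p_competitive": price(q_competitive),
        # Marginal cost and marginal revenue at the competitive output.
        "mc_at_competitive": c + d * q_competitive,
        "mr_at_competitive": a - 2 * b * q_competitive,
    }

result = monopoly_vs_competitive(a=100.0, b=1.0, c=10.0, d=1.0)
print(result)
```

With these coefficients the monopoly sells 30 units at a price of 70, against 45 units at a price of 55 under competition, and marginal cost exceeds marginal revenue at the competitive output, as the argument requires.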

The third explanation (Ex. 6) I want to consider is an explanation, in
terms of Coulomb’s law, of why the magnitude of the electric intensity
(force per unit charge) at a perpendicular distance r from a very long fine
wire having a positive charge uniformly distributed along its length is given
by the expression

$$\frac{1}{2\pi\epsilon_0}\,\frac{\lambda}{r}$$

(where $\lambda$ is the charge per unit length on the wire) and is at right angles
to the wire.

We can think of the wire as divided into short segments of length dx,
each of which may be treated as a point charge dq. The resultant intensity
at any point will then be the vector sum of the fields set up by all these
point charges. By Coulomb’s law, the element dq will set up a field of
magnitude

$$dE = \frac{1}{4\pi\epsilon_0}\,\frac{dq}{s^2}$$

at a point P a distance s from the element. Integrating the x- and
y-components of dE separately, we have

$$E_x = \int dE_x = \int dE \sin\theta$$
$$E_y = \int dE_y = \int dE \cos\theta.$$

If $\lambda$ is the charge per unit length along the wire, we have $dq = \lambda\,dx$, and

$$dE = \frac{1}{4\pi\epsilon_0}\,\frac{\lambda\,dx}{s^2}.$$

The integration will be simplified if we integrate with respect to $d\theta$ rather
than dx. From the above figure

$$x = r\tan\theta \quad\text{and}\quad s = r\sec\theta$$

and thus,

$$dx = r\sec^2\theta\,d\theta.$$

Making these substitutions, we obtain:

$$E_x = \frac{1}{4\pi\epsilon_0}\,\frac{\lambda}{r}\int \sin\theta\,d\theta$$
$$E_y = \frac{1}{4\pi\epsilon_0}\,\frac{\lambda}{r}\int \cos\theta\,d\theta.$$

If we assume that the wire is infinitely long, the limits of integration will
be from $\theta = -\pi/2$ to $\theta = \pi/2$. Integrating, we obtain

$$E_x = 0$$
$$E_y = \frac{1}{2\pi\epsilon_0}\,\frac{\lambda}{r}.$$

This shows that the resultant field will be at right angles to the wire and that
its intensity is given by

$$\frac{1}{2\pi\epsilon_0}\,\frac{\lambda}{r}.$$
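
The two integrals can also be checked numerically. The sketch below is illustrative only: it evaluates the integrals over θ with a simple midpoint rule, and the value of the permittivity constant, the charge density, the distance and the function name `field_components` are all assumptions introduced for the example.

```python
import math

EPS0 = 8.854e-12   # assumed value of the permittivity of free space, F/m

def field_components(lam, r, steps=10000):
    """Midpoint-rule estimate of E_x and E_y for an infinitely long wire.

    Integrates (lam / (4*pi*eps0*r)) * sin(theta) and * cos(theta)
    over theta from -pi/2 to pi/2.
    """
    h = math.pi / steps
    sum_sin = 0.0
    sum_cos = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * h
        sum_sin += math.sin(theta) * h
        sum_cos += math.cos(theta) * h
    k = lam / (4 * math.pi * EPS0 * r)
    return k * sum_sin, k * sum_cos

lam = 1e-9   # charge per unit length, C/m (arbitrary illustrative value)
r = 0.05     # perpendicular distance from the wire, m (arbitrary)

e_x, e_y = field_components(lam, r)
closed_form = lam / (2 * math.pi * EPS0 * r)

print(e_x, e_y, closed_form)
```

The component along the wire cancels by symmetry, and the perpendicular component agrees with the closed-form expression to within the accuracy of the numerical rule.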

Let me begin my discussion of these three examples by noting that 
they do indeed conform (or very nearly conform) to the requirements 
laid down by the proponents of the covering-law model. That is to say, 
Ex. 4-6 are (or are very nearly) sound deductive arguments, arguments 
in which a law occurs as a premise which is required for the derivation of 
the explanandum. Moreover, Ex. 4-6 seem to meet the other conditions
standardly imposed on scientific explanation by covering-law theorists.¹

It is true that Ex. 4 and Ex. 6 involve approximations or idealisations 
which are not, strictly speaking, true, although they are very nearly true. 
What can be deduced from the laws occurring in Ex. 4 and Ex. 6 and 
other true premises are in fact only approximations of the explananda in 
Ex. 4 and Ex. 6. Ex. 4 and Ex. 6 show us why their explananda are very 
nearly true or why the relationships they describe hold to the extent they 
do. Nonetheless, it would, I think, be a mistake to attach any great
significance to this fact. Certainly it seems to be a mistake to suppose, as some
writers apparently have,² that satisfying or illuminating scientific
explanations always involve approximate rather than strict derivations;
Ex. 5 is an obvious counter-example and there are many others. It seems
to me reasonable to take as our paradigmatic cases of scientific explanation 
those explanations which are sound deductive arguments and to treat 
explanations like Ex. 4 and Ex. 6 as acceptable because they approximate 
to this pattern. 

It also seems clear that, as covering-law theorists have contended, a 
law or set of laws figures essentially in each of the above derivations 
(in Ex. 4, we have Newton’s second law, and the law of gravitation; in 
Ex. 5 we have the law that all firms maximise profits; in Ex. 6 we have a 
version of Coulomb’s law). 

1 I have in mind here the additional requirements imposed by Hempel and Oppenheim
in their [1948], supplemented by those imposed by Kim in his [1963], and/or
requirements 7.1–7.5 imposed on singular explananda by Raimo Tuomela in chapter VII of
his [1973].

2 A position somewhat like this seems to be taken by Feyerabend in [1962].

In what follows I shall assume that the covering-law model is correct
as far as it goes; that is, that the covering-law model does state necessary
conditions which any acceptable scientific explanation must meet. It
is clear, however, that explanations (Ex. 1-3) as well as explanations
(Ex. 4-6) meet all of the necessary conditions on scientific explanation so
far discussed. If we wish to develop an account of scientific explanation
which is sensitive to the differences between explanations (Ex. 1-3) and
explanations (Ex. 4-6), we must formulate a further necessary condition
on scientific explanation. I suggest the following condition, which I call
the requirement of functional interdependence:


(f) The law occurring in the explanans of a scientific explanation of
some explanandum E must be stated in terms of variables or
parameters variations in the values of which will permit the deriva- 
tion of other explananda which are appropriately different from E. 


I shall first illustrate very briefly how this condition applies to the above 
explanations, and I shall then attempt to clarify it and to motivate its
adoption.

Consider the generalisations which figure in explanations (Ex. 4-6). 
These generalisations contain variables or parameters (mass, distance, 
acceleration, price, quantity of goods, charge, electrical intensity and so 
forth) which are such that a whole range of different states or conditions 
can be characterised in terms of variations of their values. The laws in 
Ex. 4-6 formulate a systematic relation between these variables. They 
show us how a range of different changes in certain of these variables 
will be linked to changes in others of these variables. In consequence, 
these generalisations are such that when the variables in them assume one 
set of values (when we make certain assumptions about boundary and 
initial conditions) the explananda in the above explanations are derivable, 
and when the variables in them assume other sets of values, a range of 
other explananda are derivable. For example, the second law of motion 
and the law of gravitation which occur in explanation (Ex. 4) are such that 
when the variables in them assume appropriate values (values for the 
mass and radius of the earth) Galileo’s law is derivable. But these generalis- 
ations are also such that when the variables in them assume different 
values (via the combination of these generalisations with a different set 
of initial or boundary conditions) quite different explananda are derivable. 


1 A requirement like the requirement imposed here has been imposed as a requirement
on the explanation of scientific laws by several writers. For example, Ernest Nagel
[1961], p. 36 holds that
    At least one of the premises in the explanation of a given law will meet the following
    requirements: when conjoined with suitable additional assumptions the premise
    should be capable of explaining other laws than the given one; on the other hand
    it should not in turn be possible to explain the premise with the help of the given
    law even when those additional assumptions are adjoined to the law.
A similar requirement is imposed by Tuomela in [1973], pp. 187 ff. The requirement I
impose is considerably stronger than this requirement and is imposed as a requirement
on the explanation of singular explananda as well as scientific laws. My discussion
below makes clear my reasons for imposing requirement (f) rather than Nagel’s
requirement.

For example, these generalisations are such that on the assumption that
the mass and radius of the earth had different values, a quite different
value for the acceleration of a falling body could be derived. These
generalisations are also such that we could use them to derive an expression
for the rate of fall of a body falling from a distance which is no longer
negligible in comparison with the earth’s radius. Indeed these
generalisations are such that we could use them to derive even more disparate
explananda; for example, we could use them in conjunction with other
information to derive Kepler’s laws and a great many other derivative
laws of Newtonian mechanics.
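
The way the same two laws yield different explananda as their variables take different values can be sketched numerically. The example below is only an illustration: the masses, radii and orbital distance are assumed reference values that do not appear in the text, the orbit is idealised as circular, and the function names are hypothetical.

```python
import math

G = 6.674e-11   # assumed value of the gravitational constant, SI units

def surface_acceleration(mass, radius, height=0.0):
    """a = G*M/(R+h)^2, from the second law and the law of gravitation."""
    return G * mass / (radius + height) ** 2

def circular_orbit_period(central_mass, orbit_radius):
    """Period T = 2*pi*sqrt(a^3 / (G*M)), from the same two laws."""
    return 2 * math.pi * math.sqrt(orbit_radius ** 3 / (G * central_mass))

# Same laws, different values of the variables (all values assumed).
g_earth = surface_acceleration(5.972e24, 6.371e6)
g_moon = surface_acceleration(7.342e22, 1.7374e6)

# A fall from a height that is not negligible compared with the radius.
a_one_radius_up = surface_acceleration(5.972e24, 6.371e6, height=6.371e6)

# A more disparate explanandum: the length of the year.
year_in_days = circular_orbit_period(1.989e30, 1.496e11) / 86400.0

print(g_earth, g_moon, a_one_radius_up, year_in_days)
```

Changing the mass and radius gives the much smaller lunar value, keeping h gives a quarter of the surface value one earth radius up, and the same laws applied to an orbit return a period close to one year.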

Similarly, the law which occurs in Ex. 5 can also be used to explain a 
whole range of different explananda—as Ex. 5 itself suggests it can be 
used to explain various features of the behaviour of a firm under competi- 
tive conditions and it could be used to explain other features of the behavi- 
our of a monopolistic firm—why it will engage in product differentiation 
under certain conditions, for example. The law occurring in (Ex. 5) is 
such that it could be used to show that if the initial conditions facing a 
monopolistic firm were to change in certain ways (assume different 
values), the behaviour of the firm would change accordingly. (If, for 
example, the slope of the firm’s average revenue curve were to become 
less steep, it would lower its prices.) And in a similar way the version of 
Coulomb’s law occurring in (Ex. 6) can be used, as the parameters in this 
law assume different values, to explain a range of different explananda— 
the expressions for electrical intensity along the axis of a uniformly 
charged ring, or between two equally and oppositely uniformly charged 
plates, or inside and outside a uniformly charged hollow sphere, for 
example. By contrast, the generalisations occurring in (Ex. 1-3) do not 
possess these features. For example, ‘Raven’ and ‘black’ are not variables 
which can be used to characterise a range of different values, and ‘All 
Ravens are black’ does not formulate a systematic relationship among 
changes in the values of these variables. The generalisations occurring 
in (Ex. 1-3) are not such that they can be combined with a range of
different assumptions about initial conditions to derive a range of different
explananda in the way that the generalisations occurring in (Ex. 4-6) can.

It is this difference between explanations (Ex. 1-3) and explanations
(Ex. 4-6) which I have attempted to capture by means of the requirement
of functional interdependence: the fact that the laws occurring in the
latter, but not in the former, express a systematic inter-relation between
variables which can assume a range of different values, and the fact that
the latter generalisations but not the former can, as they assume these
different values, be used to derive a range of quite different explananda.

I turn now to the task of explicating and clarifying the notion of
functional interdependence. First, when are two explananda ‘appropriately
different’? The idea I want to capture here is the idea that the
generalisations in a successful scientific explanation will permit the derivation of
explananda which differ in the way in which, say, the expression for the
electrical intensity at a distance from a long, uniformly charged wire
differs from the expression for the electrical intensity between two equally
and oppositely uniformly charged plates and not merely in the relatively
trivial way in which ‘a is black’ and ‘b is black’ differ. I think that we can
capture this idea for singular explananda if we say that two singular
explananda Ba and Cb are appropriately different if and only if (x) Bx
does not entail (x) Cx and (x) Cx does not entail (x) Bx. For non-singular
explananda, I shall say that $E_1$ and $E_2$ are appropriately different if and
only if $E_1$ does not entail $E_2$ and vice-versa.

A more difficult problem arises with regard to specifying the meaning 
of the phrase ‘variations in the value of a variable’. First, why do I use 
this cumbersome and obscure expression at all? Why don’t I adopt the 
following simpler and more straightforward formulation of the require- 
ment of functional interdependence? 


(f′) The law occurring in the explanans of a scientific explanation of
some explanandum E must be such that in conjunction with some
appropriate set of initial or boundary conditions, it can be used to
derive an explanandum which is appropriately different from E.


Some of my reasons for preferring (f) will only emerge later in this essay.
Nonetheless it may be helpful at this point to indicate why a simpler
formulation like (f′) won’t do.

Consider to begin with the ‘explanation’ (Ex. 7) 


All ravens are black. All diamonds are green. 

a is a raven 

. a is black 
This explanation meets requirement (f’}—1t is a generalisation which in 
conjunction with other initial conditions could be used to derive quite 
different explananda. Yet it clearly fails to exemplify the pattern we found 
in explanations (Ex. 4-6). If we do not think that (Ex. 1) is an acceptable 
scientific explanation it 1s difficult to see why the addition of the apparently 
unrelated generalisation ‘All diamonds are green’ to the explanans of 
(Ex. 1) should turn (Ex. 1) into an acceptable explanation. Clearly we need 
to require that the generalisation occurring in a scientific explanation be 
such that it can be used to explain a variety of different explananda in 
terms of the same parameters or explanatory categories, 
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But it won’t do either to require simply that the generalisation be such 
that it can be used in conjunction with the same initial condition to derive 
a range of different explananda. Consider (Ex. 8). 


All ravens are black. All ravens are cold-blooded. 
a is a raven 
∴ a is black 


Here both the blackness and cold-bloodedness of a bird are ‘explained’ 
by reference to its raven-ness. Nonetheless if we do not think that (Ex. 1) 
and (Ex. 7) are scientific explanations it is difficult to see why we should 
take (Ex. 8) to be a scientific explanation. It is hard to see how our under- 
standing of why a is black is increased when we are told that all ravens 
are cold-blooded. 

The pattern of explanation achieved in (Ex. 4-6) is in fact quite different 
from that achieved in arguments (Ex. 7) and (Ex. 8). The parameters 
appealed to in (Ex. 4-6) figure in the derivation of a range of different 
explananda, but they do so by assuming what might naturally be de- 
scribed as a range of different values. Informally, we can think of the 
parameters in (Ex. 4-6) as associated with scales containing different 
gradations, the values of these parameters being positions on these scales. 
The generalisations (Ex. 4-6) can then be thought of as showing us how 
certain movements along these scales (certain changes in the value of 
these parameters) are systematically associated with movements along 
other scales (changes in the values of other parameters). For example, the 
generalisations in Ex. 4 are such that they show us how increases or 
decreases in the mass or radius of the earth, or the distance above the 
earth’s surface from which a stone is dropped are associated with cor- 
responding changes in the explanandum. 

It is this feature of the above explanations which I have sought to 
capture by use of the phrase ‘value of a variable’. The sense I want to 
attach to this expression is roughly this: a variable may be said to have 
values when we may associate with the variable an ordinal and not merely 
nominal scale, when it is possible to talk of ‘more or less’ (in some non- 
trivial sense) in connection with the values of the variable. 

Thus, to begin with, I do not understand the requirement of functional 
interdependence in such a way that only generalisations containing 
predicates associated with a ratio scale, like ‘mass’ or ‘length’, satisfy the 
requirement of functional interdependence. Consider, for example, 
explanations of consumer choice behaviour in terms of the generalisation 
that consumers will maximise expected utility, where utility values are 
measured by the method of von Neumann and Morgenstern, which 
establishes an interval scale. Such explanations do not merely allow us to 
derive a sentence about how an economic agent will behave in certain 
circumstances, given his utility schedule and beliefs about the probability 
of various outcomes. They involve a generalisation which can be used 
to tell us how, if an economic agent’s utility schedule or probability 
beliefs alter in various ways, his choice behaviour will change accordingly. 
They are thus explanations which satisfy the requirement of functional 
interdependence. (They may of course have other infirmities.) 
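The counterfactual structure described in this paragraph can be made concrete with a small sketch. It is only an illustration under stated assumptions: the option names, outcomes, probabilities and utility numbers are hypothetical and do not come from the text, and the utilities are simply taken as given on an interval scale rather than elicited by the von Neumann and Morgenstern procedure.

```python
from typing import Dict

# Hypothetical types: a lottery maps outcomes to probabilities,
# a utility schedule maps outcomes to interval-scale utilities.
Lottery = Dict[str, float]
Utility = Dict[str, float]


def expected_utility(lottery: Lottery, utility: Utility) -> float:
    """Probability-weighted sum of the utilities of a lottery's outcomes."""
    return sum(p * utility[outcome] for outcome, p in lottery.items())


def choose(options: Dict[str, Lottery], utility: Utility) -> str:
    """Return the option with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name], utility))


# Illustrative (assumed) choice situation.
options = {
    "safe": {"modest gain": 1.0},
    "risky": {"large gain": 0.5, "loss": 0.5},
}
# Same options, but the agent's probability beliefs have shifted.
options_shifted = {
    "safe": {"modest gain": 1.0},
    "risky": {"large gain": 0.7, "loss": 0.3},
}
# Two assumed utility schedules differing only in the value of a modest gain.
cautious = {"modest gain": 0.6, "large gain": 1.0, "loss": 0.0}
bold = {"modest gain": 0.4, "large gain": 1.0, "loss": 0.0}

actual = choose(options, cautious)                 # the derived explanandum
if_utilities_differed = choose(options, bold)      # what-if: other utility schedule
if_beliefs_differed = choose(options_shifted, cautious)  # what-if: other beliefs
```

The same generalisation that yields the actual choice also yields the choices the agent would have made had his utility schedule or probability beliefs taken different values, which is the feature the requirement of functional interdependence is meant to capture.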

A generalisation may meet the requirement of functional interdepen- 
dence even if it does not contain predicates which are associated with an 
interval scale, so long as it contains predicates which are associated with 
a rough ordinal scale. For example, in his essay, ‘Imperfect Rationality’ 
(Watkins [1970]), John Watkins introduces what he calls a ‘step-wise 
likelihood scale’ of subjective probability. This is a scale consisting of 
broad discontinuous gradations which represent a subject’s classification 
of various events as to their subjective probability. The subject may be 
unable to say whether he thinks a war between the U.S. and Canada is 
more or less likely than a war between Britain and France, but he may 
think that both these possibilities fall within a broad gradation which 
places them considerably lower on the likelihood scale than war between 
Israel and Syria. And he may think that the likelihood of war between 
Russia and China occupies some intermediate gradation on the scale. If 
we assume that the value the subject assigns to various outcomes may be 
similarly scaled we might use the ‘law’ that he will maximise expected 
utility to explain why he chooses as he does under a variety of different 
kinds of situations involving uncertainty. Here only crude comparisons 
among the subject’s beliefs regarding the probabilities of various out- 
comes and the values he assigns to these different outcomes are possible, 
and we are not, as in the example above, able to measure these on an 
interval scale, but nonetheless at least in some cases it seems possible, 
given the subject’s preferences and beliefs, not only to derive how he will 
choose, but to say how, if his preferences and beliefs had been different, 
he would have chosen differently. Here too, I want to say that it is ap- 
propriate to talk of the ‘values’ of the subject’s beliefs and preferences 
and of how, if those preferences and beliefs had assumed different values, 
he would have chosen in various different ways. Here too I want to say 
that this generalisation may occur in explanations satisfying the require- 
ment of functional interdependence. 

Consider another case. In his recently published study, Religion and 
Regime ([1967]) Guy Swanson undertakes to classify political regimes 
on a scale with five gradations, according to the extent to which governmental 
power in the regime is shared by the members of the political 
community. He also attempts to classify various religions according to 
the extent to which they subscribe to a belief in the immanence of God. 
He suggests that the items on these two scales are systematically inter- 
related—in particular that the extent to which governmental power in 
a regime is concentrated is positively correlated with the degree to which 
the religion to which it subscribes is committed to a belief in the immanence 
of God. Thus, for example, centralist regimes with a single autocrat 
(regimes in which power is most concentrated) are said to be typically 
Roman Catholic (the religion which most emphasises the immanence 
of God), while regimes in which a number of groups play an important 
role in determining governmental policy are said to be typically Calvinistic 
(the religion which least emphasises the immanence of God). Regimes 
in various intermediate positions on the concentration-of-power scale 
are correlated with religions that take various intermediate positions 
with respect to divine immanence (Lutheranism, Anglicanism). 

I do not wish to defend Swanson’s ‘theory’ (which in fact seems to me 
to be rather implausible), but rather to draw the reader’s attention to 
the fact that it seems to satisfy, in an admittedly crude way, the require- 
ment of functional interdependence. In Swanson’s account, various 
gradations in two magnitudes—concentration of political power and 
degree of belief in divine immanence—are distinguished. Swanson does 
not merely advance the hypothesis that some specific degree of concentra- 
tion of political power is correlated with some particular religion. He also 
makes claims about how a range of states of concentration of political 
power are correlated with a range of values of a certain religious variable. His 
theory suggests how if the concentration of political power within a regime 
varies, its religious character will change and in doing so satisfies the require- 
ment of functional interdependence with respect to certain explananda. 

Similarly, other predicates associated with step-wise scales which 
establish a crude order (indeed, any vocabulary which allows us to talk 
about a range of phenomena in terms of gradations in a few basic 
parameters) might conceivably occur in a scientific law which satisfies the 
requirement of functional interdependence. 

Those cases in which, on my use of the phrase, it will be inappropriate 
to talk of the ‘values of a variable’ are cases which involve what is some- 
times called a nominal scale, a scale which does not indicate an order, but 
only sameness or difference. Suppose, for example, that we say that x is 
a gem if it is an emerald, ruby, or beryl. Suppose that we also say that the 
gem-value of x = 1, 2 or 3 when x is respectively an emerald, a ruby, or 
a beryl and that the colour-value of x = 2, 4 or 6 when x is respectively 
green, red or blue. Consider now the generalisation (G₁) which tells us 
that for any gem, its colour-value will be twice its gem-value. Should we 
say that 1, 2 and 3 are values of the variable ‘gem-value’ and that (G₁) 
satisfies the requirement of functional interdependence with regard to 
(A) ‘All emeralds are green’? Clearly we want to avoid this conclusion, 
for (G₁) is no more than the conjunction of (A) and (B) ‘All rubies are 
red’ and (C) ‘All beryls are blue’. 
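A short sketch may help show why the gem-value generalisation carries no more content than a lookup table. The codings are those stipulated in the text; the function names, the relabelled coding and the sapphire query are assumptions introduced purely for illustration.

```python
# Codings stipulated in the text.
gem_value = {"emerald": 1, "ruby": 2, "beryl": 3}
colour_value = {"green": 2, "red": 4, "blue": 6}

# The conjunction of (A), (B) and (C), written as a plain table.
colour_of = {"emerald": "green", "ruby": "red", "beryl": "blue"}


def doubling_rule_holds(gem_code, colour_code):
    """True if 'colour-value is twice gem-value' holds for every listed gem."""
    return all(colour_code[colour_of[g]] == 2 * gem_code[g] for g in colour_of)


def predicted_colour(gem, gem_code, colour_code):
    """Colour the doubling rule yields for a gem, or None if the gem has no code."""
    code = gem_code.get(gem)
    if code is None:
        return None  # nothing tells us how to extend the coding to a new gem
    matches = [c for c, v in colour_code.items() if v == 2 * code]
    return matches[0] if matches else None


# An equally admissible (hypothetical) relabelling: still one distinct number per gem.
relabelled = {"emerald": 3, "ruby": 1, "beryl": 2}

holds_original = doubling_rule_holds(gem_value, colour_value)
holds_relabelled = doubling_rule_holds(relabelled, colour_value)
sapphire_colour = predicted_colour("sapphire", gem_value, colour_value)
```

Under the stipulated coding the doubling rule merely reproduces the table; under a relabelling that respects the only constraint on the stipulation (one distinct number per gem) it fails, and for a gem outside the list it yields nothing at all. An ordinal scale, by contrast, constrains where a new case may be placed.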

The crucial difference between the artificial predicate ‘gem-value’ and 
the other predicates I have been considering seems to be something like 
this: in explaining what the ‘gem-value’ of a gem was we simply stipulated 
what this value was for various gems. The only requirement on this 
stipulation was that each of the gems we considered be assigned some 
(one) number, and that the same number not be assigned to different 
gems. Because of this the predicate ‘gem-value’ does not describe an 
order in any non-trivial sense. If we were to discover other kinds of gems 
(say, sapphires or rhinestones) and ask what their gem-values were, or 
how (even very roughly) their gem-values compared with the gem-values 
of the other gems discussed above, we would have no idea how to go 
about answering this question, even in principle. We may contrast this 
with the variables ‘subjective probability’ or ‘governmental centralisation’ 
which Watkins and Swanson proposed to introduce. It is true that it may 
be difficult to say where many governments or beliefs about likelihood 
will fall on Swanson’s or Watkins’s scale, but nonetheless we have some 
idea about what sorts of considerations are relevant to such classifications. 
Presumably there are some previously unclassified governments or beliefs 
that we would know, roughly at least, where to put on such scales, and 
we can imagine the addition of further steps or other refinements to such 
scales. These features of predicates like ‘subjective probability’ and 
‘governmental centralisation’ reflect the fact that, unlike a predicate like 
‘gem-value’, they amount to something more than, so to speak, simply a 
list of the values they may take for various arguments. It is only when a 
variable possesses these features—only when in some non-trivial sense it 
is associated with an ordered scale—that I shall speak of ‘values’ of the 
variable, and it is only when a generalisation is stated in terms of variables 
which may assume different values that it will be possible for it to meet 
the requirement of functional interdependence. 

I want now, by way of conclusion to this section, to note that the 
requirement of functional interdependence is at best a necessary condition 
and not a sufficient condition for an acceptable scientific explanation. 
There are many generalisations which exhibit the pattern we have called 
functional interdependence with respect to different potential explananda 
and yet which should not be regarded as explaining those explananda. 
We may, for example, have a theory in which the values of two variables 
U₁ and U₂ exhibit some regular relationship but in which U₁ and U₂ are 
thought of as causally unrelated, the regular relationship between them 
being thought of as explainable in terms of some third set of variables 
or conditions. In such a case we may not be able to explain variations in 
the value of U₁ in terms of variations in the value of U₂ even though U₁ 
and U₂ may exhibit the pattern of inter-relation we have called functional 
interdependence. For example, the Franz-Wiedemann law states that 
k/Tσ is a constant, where k is thermal conductivity, σ is electrical conductivity 
and T is absolute temperature. But while this law satisfies the 
requirement of functional interdependence with respect to a number of 
different explananda, it is generally not supposed that we might use it, in 
conjunction with statements about the absolute temperature and thermal 
conductivity of a given piece of metal, to explain why the metal has the 
electrical conductivity it does. Rather the electrical and thermal con- 
ductivities of the metal, as well as the general relationship between them 
expressed in the Franz-Wiedemann law are thought of as explainable in 
terms of other features of the metal. 
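The point that the law licenses derivations in either direction, without thereby settling what explains what, can be sketched numerically. This is an illustration only: the function names are hypothetical, the constant is the approximate theoretical Lorenz number for an idealised metal, and the sample inputs are rough, assumed figures rather than measured data.

```python
import math

# Approximate theoretical value of the constant k/(T*sigma), in W·Ω·K^-2.
# Assumption: an idealised metal for which the law holds exactly.
LORENZ = 2.44e-8


def electrical_conductivity(k: float, T: float, L: float = LORENZ) -> float:
    """Derive sigma (S/m) from thermal conductivity k (W/(m·K)) and temperature T (K)."""
    return k / (L * T)


def thermal_conductivity(sigma: float, T: float, L: float = LORENZ) -> float:
    """Derive k (W/(m·K)) from electrical conductivity sigma (S/m) and temperature T (K)."""
    return L * sigma * T


# Assumed, illustrative inputs (roughly copper-like at room temperature).
k_given, T_given = 400.0, 300.0
sigma_derived = electrical_conductivity(k_given, T_given)
k_recovered = thermal_conductivity(sigma_derived, T_given)
round_trip_ok = math.isclose(k_recovered, k_given)
```

Both functions are equally good derivations, and varying either input systematically varies the output. Nothing in the arithmetic marks one conductivity as explanatorily prior to the other, which is why the requirement of functional interdependence cannot by itself be a sufficient condition.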

There is another very closely related feature of scientific explanation 
that talk of functional interdependence does not capture—it is insensitive 
to what is sometimes called the ‘direction’ of explanation. The account 
of scientific explanation presented here has nothing to say about why (to 
use Sylvain Bromberger’s example) ([1966]) we are inclined to suppose 
that we can explain the period of a pendulum by reference to its length 
and yet not inclined to suppose that we can explain its length by reference 
to its period. 

These examples suggest that a fully acceptable model of scientific 
explanation will need to embody some characteristically causal notions 
(e.g., some notion of causal priority), or some more generalised analogue 
of these (e.g., some notion of explanatory priority).¹ I have not, in my 
remarks above, attempted to deny this but have rather contended that 
the covering-law model must also be supplemented in a quite different 
way, along the lines suggested by the requirement of functional inter- 
dependence. The features of a good scientific explanation which I have 
attempted to capture via the requirement of functional interdependence 
make a contribution to its explanatory power which is at least in part 
independent of the contribution made by the causal features of scientific 
explanation mentioned immediately above. The difference between 
explanations like Ex. 2 and Ex. 3 and an explanation like Ex. 4 does not 
consist in the fact that the former explanations are causal and the latter 
is not. A D-N explanation may contain a causal law and yet fall short of 
being a scientific explanation in the sense that Ex. 4-Ex. 6 are scientific 
explanations. 

¹ I leave open the question of how these causal features of scientific explanation are 
ultimately to be analysed or understood. It may well be that these features can be 
explicated in terms of some model of scientific explanation which does not presuppose 
them. Evan Jobe’s recent discussion of Bromberger’s pendulum example (Jobe [1976]) 
seems to proceed along these lines. According to Jobe, we can deductively explain why 
a pendulum has a certain length without making use of the fact that it has a certain 
period, but any deductive explanation of its period will necessarily involve the fact that 
it has a certain length. This difference, according to Jobe, accounts for our willingness 
to explain a pendulum’s period in terms of its length and our unwillingness to explain 
its length in terms of its period. Thus in Jobe’s discussion the ‘directional’ character 
of scientific explanation is not regarded as primitive but is itself something which can 
be explicated in terms of a more basic notion of scientific explanation. If Jobe’s discussion 
is correct, it provides additional support for my neglect of the ‘directional’ features of 
scientific explanation in this essay. 


2 While I have attempted to clarify the requirement of functional 
interdependence I have so far said relatively little to motivate its adoption. 
I can best begin to do this by developing the contrast between my account 
of the way in which a scientific explanation provides understanding and 
Hempel’s account in a bit more detail. In his essay ‘Aspects of Scientific 
Explanation’ ([1965], p. 327), Hempel writes:¹ 


The D-N argument shows that, given the particular circumstances and laws 
in question, the occurrence of the phenomenon was to be expected; and it is 
in this sense that the explanation enables us to understand why the phenomenon 
occurred. 


If we confine ourselves to those cases of scientific explanation which 
involve deterministic laws we can say, I think, without serious distortion, 
that for Hempel a scientific explanation explains by exhibiting a nomologic- 
ally sufficient condition for the explanandum, by showing us that, given 
certain laws and initial conditions the explanandum-phenomenon ‘had’ 
to occur, that it could be expected with certainty. 

On my view, in contrast, an adequate scientific explanation provides 
understanding not merely by showing us this, but also by showing us 
how, if matters had been different in certain respects, other outcomes 
besides the explanandum phenomenon would have ensued. A scientific 
explanation not only shows that the explanandum phenomenon was to be 
expected, but also enables us to answer questions of the form ‘What would 
have happened if...’. A successful scientific explanation accomplishes 
this by exhibiting the explanandum phenomenon as one of a range of 
states, any one of which might have occurred had initial conditions, 
boundary conditions, and so forth been different in various ways from 
what they actually were. We are then shown why, conditions being what 
they are, the explanandum phenomenon rather than one of these alterna- 
tive outcomes occurred. In effect we are not just shown that the explanan- 
dum phenomenon had to occur, but are given some sense for the range 
of conditions under which it would have occurred. (It is, I take it, clear 
enough how, in meeting the requirement of functional interdependence, 
Ex. 4-6 exhibit these features.) 

¹ Cf. also ‘Aspects of Scientific Explanation’, pp. 367-8, where the following condition 
is imposed as a ‘general condition of adequacy for any rationally acceptable explanation 
of a particular event’: 

Any rationally acceptable answer to the question ‘Why did event X occur?’ must 
offer information which shows that X was to be expected—if not definitely, as in the 
case of D-N explanation, then at least with reasonable probability. Thus the explanatory 
information must provide good grounds for believing that X did in fact occur; 
otherwise that information would give us no adequate reason for saying ‘That 
explains it—that does show why X occurred’. 

But why should this additional information have explanatory signifi- 
cance? One way to appreciate the significance of this additional information 
is to note that if we simply require that an explanans provide a nomo- 
logically sufficient condition for the explanandum we do not insure that 
the explanans is relevant to the explanandum. When we require, in addi- 
tion, that the laws in an explanans be such that they could be used to 
answer a set of what-if-things-had-been-different questions, we help to 
insure that the explanans will perspicuously identify those conditions 
which are relevant to the explanandum being what it is. 

Let me begin with a very simple and thus possibly misleading example 
which is taken from Wesley Salmon’s essay ‘Statistical Explanation and 
Statistical Relevance’ [1971]. Consider the generalisation (L₉) ‘All men 
who take birth control pills regularly won’t get pregnant’. This generalisa- 
tion is universal in scope and supports counter-factuals. It seems to 
satisfy the usual syntactic conditions for law-likeness.¹ Can it then be 
used to explain why Mr Jones, a man who has been taking birth control 
pills regularly, fails to get pregnant? That is, is 
(Ex. 9) 

(L₉) All men who take birth control pills regularly fail to get pregnant. 
(C₉) Mr Jones is a man who takes birth control pills regularly. 
(E₉) Mr Jones fails to get pregnant. 

an acceptable explanation? 

I think it is clear that (Ex. 9) is a defective explanation. While the 
explanans of this explanation does indeed exhibit a nomologically sufficient 
condition for the explanandum, it does not identify a set of factors or 
conditions which are relevant to the explanandum. This is of course 
reflected in the fact that even if Mr Jones stopped taking birth control 
pills, he still would not get pregnant. Given that Mr Jones is a man, 
whether or not he takes birth control pills has, as we say, ‘nothing to do’ 
with whether or not he gets pregnant. 

¹ I put aside questions about whether the class of men, as a subclass of a biological species, 
involves an implicit spatio-temporal reference that disqualifies (L₉) from lawlikeness. 

Now contrast (L₉) with (L₁₀): 


(L₁₀) All women who meet condition K (K has to do with whether the 
woman is fertile, has been having intercourse regularly and so 
forth) and who take birth control pills regularly will not get 
pregnant and furthermore all women who meet condition K and 
do not take birth control pills regularly will get pregnant. 


Suppose that (C₁₀) Mrs Jones is a female who meets condition (K) and 
has been taking birth control pills regularly. Can (L₁₀) and (C₁₀) be used 
to explain (E₁₀) why Mrs Jones doesn’t get pregnant? Here of course we 
have considerably more inclination to say that at least a crude explanation 
of E₁₀ has been provided. The reason is obvious—whether or not Mrs 
Jones is taking birth control pills and meets condition K does have a lot 
to do with whether she gets pregnant. Here the explanation given (let us 
call it Ex. 10) identifies and exhibits conditions which are not merely 
sufficient for, but are also relevant to E₁₀. 

This difference is of course reflected in the fact that the explanation 
given for Mrs Jones’s non-pregnancy shows us, while (Ex. 9) does not, 
how if conditions had been different, a different outcome would have 
ensued. That is to say, L₁₀ is such that it could be used in conjunction 
with a statement that Mrs Jones is not taking birth-control pills to derive 
the result that Mrs Jones does get pregnant. By contrast, even if we 
attempt to supplement L₉ along the lines of L₁₀, L₉ will not have this 
feature—it cannot be used in connection with some different set of initial 
conditions to derive some explanandum appropriately different from E₉. 
We may say that the explanation given for Mrs Jones’s non-pregnancy 
does, while explanation (Ex. 9) does not, identify in a crude way the range 
of conditions under which the associated explanandum will hold. In 
contrast to (Ex. 9), (Ex. 10) shows us why, a certain condition being what 
it is, the explanandum of (Ex. 10) rather than certain alternatives was 
realised and in so doing shows how this condition (Mrs Jones’s taking 
birth control pills) makes a difference for, or is relevant to, the explanan- 
dum. 
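The contrast between the two pill examples amounts to a simple what-if test, which can be sketched as follows. The sketch is an idealisation under stated assumptions: the class and function names are hypothetical, the outcome rule combines the law about women meeting condition K with the background fact that men do not get pregnant, and the treatment of women who fail condition K (assumed here not to get pregnant) is a stipulation the text does not make.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    female: bool
    meets_k: bool      # condition K from the text (fertility and so forth)
    takes_pills: bool


def gets_pregnant(p: Person) -> bool:
    """Idealised outcome rule: only women meeting K who take no pills get pregnant."""
    return p.female and p.meets_k and not p.takes_pills


def makes_a_difference(p: Person, factor: str) -> bool:
    """What-if test: does flipping one initial condition change the outcome?"""
    flipped = replace(p, **{factor: not getattr(p, factor)})
    return gets_pregnant(flipped) != gets_pregnant(p)


mr_jones = Person(female=False, meets_k=False, takes_pills=True)
mrs_jones = Person(female=True, meets_k=True, takes_pills=True)

pills_relevant_for_mr_jones = makes_a_difference(mr_jones, "takes_pills")
pills_relevant_for_mrs_jones = makes_a_difference(mrs_jones, "takes_pills")
```

In both cases the cited condition is sufficient for the outcome, but only in the second does varying it yield an appropriately different explanandum. That is the sense in which the second explanation identifies a relevant condition and the first does not.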

I want to suggest that the features of (Ex. 4) which have to do with 
its satisfaction of the requirement of functional interdependence are 
important because they play an analogous role in the explanation of 
Galileo’s law, because they help to insure that (Ex. 4) identifies conditions 
which are relevant to and not merely nomologically sufficient for the 
truth of Galileo’s law. It is when we have an explanation which perspicu- 
ously identifies the range of conditions under which Galileo’s law holds, 
which is such that it shows how if various conditions were different in 
various ways explananda other than Galileo’s law would be true, that we 
have an explanation which shows us how these factors are relevant to the 
truth of Galileo’s law. In a similar way it is because explanation (Ex. 5) 
does not merely provide a sufficient condition for its explanandum—that 
a firm facing certain conditions will exhibit a certain kind of behaviour— 
but also employs a generalisation which can be used to show how if the 
firm were faced by various different conditions (if for example, it did not 
face a downward-sloping demand curve) it would behave in a variety of 
different ways, that we can think of (Ex. 5) as perspicuously identifying 
conditions which are not merely sufficient for, but relevant to the behaviour 
of a monopolistic firm. 

In (Ex. 4) and (Ex. 5) we have of course a considerably more detailed 
and general identification of the conditions under which the explananda 
involved obtain than we do in the case of (Ex. 10). (This is one of the 
reasons why (Ex. 4) and (Ex. 5) are considerably better explanations than 
(Ex. 10), which does not qualify as scientific at all according to our stan- 
dards, since it does not involve an ordinal scale.) In an explanation 
involving L₁₀ there are in effect two possible initial conditions which 
may obtain—Mrs Jones either may or may not take birth control pills— 
and two possible explananda—Mrs Jones either may or may not get 
pregnant. In a successful scientific explanation we have a kind of general- 
ised analogue of this feature—the explanation identifies not two but a 
great range of possible explananda, and a range of possible initial condi- 
tions under which these different explananda will be realised. The ex- 
planation explains in part in virtue of showing us how it is that it was 
the explanandum rather than one of these many alternative possibilities 
that was realised and in doing so, perspicuously identifies those conditions 
which are relevant to the obtaining of these various explananda. 

Consider another case. Suppose that C is a consumer (or rational 
agent) who chooses alternative A₁ over alternative A₂. Suppose that we 
undertake to explain his choice behaviour by reference to the generalisa- 
tion: (G₂) All consumers (or rational agents) will choose A₁ over A₂. 
Now contrast this explanation in terms of (G₂) with an explanation of 
C’s choice behaviour in terms of an account like that proposed by Watkins. 
In comparison with the explanation in terms of (G₂) the explanation in 
terms of Watkins’s account identifies in a much more perspicuous way 
those factors or conditions which are actually relevant to C’s choice. On 
the explanation in terms of Watkins’s account we are shown that A₁ is 
chosen because A₁ is the choice that will maximise C’s expected utility 
given that C has the preferences and beliefs regarding the likelihood of 
various outcomes that he does. 

On the explanation in terms of (G₂) we learn only that C’s choosing as 
he does has something to do with the fact that he is a consumer, or a 
rational agent. Our sense that the explanation in terms of Watkins’s 
account has more perspicuously identified those factors which are relevant 
to C’s choice is reflected, I have been arguing, in the fact that in this case 
we have an explanation which satisfies the requirement of functional 
interdependence, an explanation which shows us how, if things had been 
different in various ways (if C had a different set of preferences or a 
different set of probability beliefs), C would have chosen in a variety of 
different ways. 

Consider a last example. Contrast explanation (Ex. 1) above of why 
some raven, a, is black with what might more naturally be described as a 
scientific explanation of this explanandum. A scientific explanation of 
why some raven is black would, on my view, not simply involve some 
generalisation like ‘All ravens are black’, but would rather involve some- 
thing like this: an identification of those specific biochemical reactions 
within ravens which produce their distinctive pigmentation, and a specifica- 
tion of the genetic mechanisms which are responsible for those reactions. 
An explanation of this kind (let us call it Ex. 11) would, of course, satisfy 
the requirement of functional interdependence. It would, for example, 
presumably involve an appeal to mechanisms which could be used to 
explain why some non-black but otherwise raven-like bird has the colour 
it has. It might also be an account which could be used to explain why 
birds of other related species have the colours they do. It would, in any 
event, be an account which identified the range of conditions under 
which the explanandum would hold, an account which makes clear how 
if the genetic structure of ravens, or their biochemistry were to alter in 
certain ways, their colour would also change. The fact that the scientific 
explanation of a raven’s blackness possesses these features is, I contend, 
closely bound up with the fact that it seems to identify those factors (the 
genetics and biochemistry of the raven) which are relevant to the raven’s 
blackness in a relatively perspicuous way, while explanation (Ex. 1) does 
not. 

This example also suggests another important point about scientific 
explanation. Often our background knowledge will create definite ex- 
pectations about the range of additional explananda a successful scientific 
explanation must be able to explain and thus definite expectations about 
how the requirement of functional interdependence must be met. Even 
if all the ravens we have observed are black, we know that there are 
‘anomalous’ members of other species which do not have the colours 
characteristic of those species. We also know that the colours of many 
species seem to vary with their geographical location—for example, 
many organisms which are white in colour are found in regions in which 
there is a considerable amount of snow. This background knowledge 
creates the expectation that the account we give of the blackness of ravens 
will have something in common with the account we give of the colour 
of certain other species and because we expect that the latter account 
will say something about the conditions under which anomalously coloured 
members of those species will occur or under which the colouration of 
those species will vary with geographical location, we expect that the 
account we give of the blackness of ravens will do a similar thing. To the 
extent that the account we give of the blackness of ravens remains isolated 
and unintegrated with our background knowledge (and note how this 
is the case with a generalisation like ‘All ravens are black’) we fail to 
provide an adequate scientific explanation. Here, too, we see that we may 
have an account which exhibits a nomologically sufficient condition for 
the blackness of some raven, and yet which, as a consequence of its 
inability to explain certain other explananda, fails to provide a scientific 
explanation for the blackness of that raven. 

There is another way of putting the contrast between my account of 
scientific explanation and Hempel’s account. For Hempel scientific 
explanation is a ‘local’ affair. The question of whether an explanans E₁ 
explains an explanandum E₂ is thought of as a question which can be 
resolved simply by focusing our attention on E₁ and E₂. That is 
to say, the relation between E₁ and other explananda which are quite 
different from E₂ is not thought of as relevant to the question of whether 
E₁ explains E₂. Given this sort of orientation it is natural to think of E₁ 
as explaining E₂ by exhibiting (in the paradigmatic case) a nomologically 
sufficient condition for E₂. (Given this focus on the local aspects of 
explanation, what more could E₁ do?) 

By contrast, on my view scientific explanation is a more global or 
systematic affair. Whether or not E₁ explains E₂ depends in part on the 
relation between E₁ and other sentences which are quite different from 
E₂. A scientific theory does not confer intelligibility on a set of phenomena 
via a series of local, independent exhibitions of those phenomena as 
necessitated. On my account the kind of understanding provided by a 
scientific theory rather has to do with the ability of that theory to 
draw together an apparently disparate set of phenomena, to account 
systematically for these phenomena in terms of variations in the values
of the same set of parameters in some small set of laws.¹

Finally, by way of conclusion to this section I want to mention two 
additional features of the account of scientific explanation I have sketched 
above. First, we may note that my account enables us to understand why 
the development of abstract, uniform, homogeneous vocabularies generally
plays such a crucial role in the construction of scientific theories. If a law 
or theory is to satisfy the requirement of functional interdependence in 
any very strong way, we must be able to characterise a large range of 
states in terms of variations in the values of a few basic parameters. To 
the extent that the vocabulary of a theory contains sharp qualitative 
transitions and dichotomies rather than continuous gradations, the theory 
will be unable to achieve the uniform treatment of a range of cases neces- 
sary to satisfy the requirement of functional interdependence. Because 
the vocabulary of ordinary language is generally a vocabulary which 
contains sharp qualitative transitions and dichotomies, it is generally not 
a vocabulary which can be used to formulate generalisations which 
satisfy the requirement of functional interdependence to any very significant 
degree. It is largely for this reason that scientific theories are not stated 
in the vocabulary of ordinary language and that the construction of a 
scientific theory generally requires a switch to quite different vocabulary. 
The characteristic flattening out or homogenising of the world which is 
achieved through the use of words like ‘mass’, ‘velocity’, ‘energy’, ‘utility’, 
‘force’, and so forth is not a fortuitous feature of scientific theories but 
rather makes an essential contribution to their explanatory power.²

A similar set of remarks can be made about the role played by the 
development of systems of measurement in the construction of scientific 
theories. It has often been noted that the development of precise systems 
of measurement is one of the most distinctive features of modern science, 
and it is commonly held that the scientific status of a discipline is in 
some way closely bound up with the availability of appropriate techniques 
of measurement for that discipline. 


1 The terms ‘local’ and ‘global’ as well as the associated contrast between a conception 
of explanation in which scientific explanations ‘confer intelligibility on individual 
phenomena by showing them to be... necessary’ and a conception of explanation
which stresses the systematic features of scientific understanding are taken from Michael 
Friedman’s essay ‘Explanation and Scientific Understanding’ [1974]. However the 
model of scientific explanation Friedman develops differs in a number of respects from 
my own. 

2 These remarks connect up with, and provide a further rationale for, Donald Davidson's
claim that we may typically expect the law ‘underlying’ a singular causal sentence to be 
stated in a technical, non-ordinary vocabulary which differs from the ordinary vocabulary 
in which the singular causal sentence is stated (cf. Donald Davidson [1967]). This point 
is explored in more detail in my ‘Singular Causal Explanation in History’ (forthcoming). 
My account of scientific explanation enables us to understand why
measurement should have this sort of significance—it links the explanatory 
power of a scientific theory to the availability of such techniques of 
measurement. By contrast, while there are many reasons that have nothing 
to do with explanation which a Hempelian might have for preferring a 
theory which contains quantitative laws or in which precise measurements 
are employed, it is not easy to see how the presence of these features 
might confer an explanatory advantage on a theory, given a Hempelian 
conception of explanation. 


3 An account of scientific explanation in terms of functional inter- 
dependence also fits naturally with the intuitively attractive idea that a 
successful scientific explanation does not merely show that the explanan- 
dum phenomenon occurs regularly but also exhibits the explanandum 
phenomenon in a new light, allowing one to see the relevance of certain 
considerations which were not apparent from the original characterisation 
of the explanandum. Consider someone who asks for a scientific ex- 
planation of why some raven is black. Such a person (let us call him Q) 
does not, I think, merely wish for a demonstration via a law and some 
statement of initial conditions, that this raven ‘had’ to be black. Rather, 
when Q asks for a scientific explanation of why some raven is black, he 
wishes to know what it is about that raven which makes it black. Q is
puzzled because he is unable to identify those features of a raven which 
are relevant to its blackness or is unaware of the laws governing those 
features. When Q is in such a situation he will not be helped, it seems to 
me, by being told that all ravens are black. This generalisation simply 
tells Q that all other ravens have the feature he finds puzzling about this
particular raven. This information may be of interest to Q, but it will
serve to generalise rather than dispel his original puzzlement. If Q does 
not know what it is about the explanandum phenomenon which results 
in its behaving as it does, this puzzlement will not be relieved when Q 
is told that the explanandum phenomenon always behaves in the way Q
finds puzzling. 

Instead, Q requires something quite different—an explanation that 
will draw his attention to further considerations the relevance of which 
is not apparent from Q’s original characterisation of the explanandum 
under investigation. This is what the scientific explanation of why some 
raven is black does. It allows us to see the raven not simply as a raven 
but rather as a system having a certain genetic and biochemical structure 
and allows us to see the relevance of the laws governing this structure to 
the raven’s colour. We are shown that the bird possesses one kind of 
genetic and biochemical structure among the many different kinds of
such structures different organisms may possess, and that in each case it 
is the character of this structure which is relevant to an organism’s colour. 

I want to suggest that this is a typical feature of a successful scientific 
explanation—that typically such an explanation will do more than merely 
show us that the puzzling explanandum phenomenon occurs regularly, 
that it will allow us to see the explanandum phenomenon in a new light 
by drawing our attention to a previously unappreciated set of considera- 
tions. I shall express this by saying that a scientific explanation typically 
involves a ‘reconstrual’ of the explanandum. It should be clear that an 
explanation which meets the requirement of functional interdependence 
will, at least in many cases, possess these features. This is because an 
explanation which meets the requirement of functional interdependence 
involves seeing the explanandum phenomenon or state as one instance 
of a more generally characterised range of phenomena or states, the 
occurrence of any one of which could be explained in terms of the same 
generalisation, given the occurrence of the appropriate initial conditions. 
An explanation which fails to meet the requirement of functional interdependence,
an explanation like (Ex. 1), will also fail to exhibit its
explanandum in a new light or to introduce a set of considerations the 
relevance of which was previously unappreciated. It is this failure which 
accounts, I believe, for the air of triviality which surrounds an explanation 
like (Ex. 1). In imposing the requirement of functional interdependence, 
we require that the generalisation used to explain why some raven is black 
go beyond this explanandum in some more ambitious sense than that 
evinced in (Ex. 1), that it do something more than simply assure us that 
the explanandum phenomenon occurs regularly. 

Consider another example. Even in the relatively primitive account of 
choice-behaviour devised by Watkins, a consumer’s choice of A₁ over A₂
is not explained by reference to a generalisation like ‘all consumers will
choose A₁ over A₂’ but rather by reference to a generalisation to the effect
that consumers will maximise expected utility. We come to understand
why this particular consumer has chosen A₁ over A₂ not when we are told
they all do, but rather when we come to see this choice as an instance of
expected utility maximising behaviour. And this latter ‘scientific’ explanation
involves the introduction of considerations which were not
apparent from the initial characterisation of the explanandum as a consumer’s
choice of A₁ over A₂. The scientific explanation identifies, in
virtue of satisfying the requirement of functional interdependence, those
factors (the consumer’s beliefs and preferences) which are relevant to the
consumer’s choice, factors which are not identified in any explanation
which simply depends on the generalisation ‘all consumers will choose
A₁ over A₂’. An important part of the scientific explanation of the consumer’s
choice consists in coming to see that these factors make a difference
to the consumer’s choice—in coming to see that if the consumer’s prefer- 
ences and probability beliefs had been different in various ways he would 
have chosen differently. It is because an explanation in terms of Watkins’s 
theory satisfies, at least in a crude way, the requirement of functional 
interdependence that it provides this sort of information and satisfies our 
demand for a reconstrual of the explanandum. 
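
The counterfactual dependence just described can be made concrete with a small illustrative sketch. It is not Watkins's own formalism: the two states s1 and s2, the utility figures and the probability beliefs below are hypothetical, chosen only to show that one and the same generalisation (maximise expected utility) yields A1 under one set of beliefs and A2 under another.

```python
def expected_utility(action, beliefs, utility):
    """Probability-weighted utility of an action over the possible states."""
    return sum(p * utility[(action, state)] for state, p in beliefs.items())


def choose(actions, beliefs, utility):
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, beliefs, utility))


# Hypothetical preferences: utility of each action in each state.
utility = {
    ("A1", "s1"): 10, ("A1", "s2"): 0,
    ("A2", "s1"): 4,  ("A2", "s2"): 5,
}
actions = ["A1", "A2"]

# Hypothetical probability beliefs, actual and counterfactual.
actual_beliefs = {"s1": 0.6, "s2": 0.4}
counterfactual_beliefs = {"s1": 0.2, "s2": 0.8}

print(choose(actions, actual_beliefs, utility))          # choice under the actual beliefs
print(choose(actions, counterfactual_beliefs, utility))  # choice had the beliefs differed
```

By contrast, a generalisation of the form ‘all consumers will choose A1 over A2’ has no parameters to vary, and so supplies nothing from which the second result could be computed.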

There is a tendency among many covering-law theorists to think of the 
explanation of particular facts (i.e., explanations in which the explananda
are singular sentences) as a primary or at least typical scientific activity. 
But, in fact, as the examples of scientific explanation we have discussed 
suggest, scientific explanations typically have as their explananda generalis- 
ations rather than singular sentences. One finds, in scientific treatises and 
textbooks, explanations of why simple pendula have periods T = 2π√(l/g),
of why monopolies are output restrictors, of Boyle’s and Charles’s Law,
of Bernoulli’s equation, of the expression relating pressure and volume
in gases undergoing an adiabatic process. One does not find in addition
to these explanations a distinct kind of explanation in which, e.g., the 
period of some particular simple pendulum is explained in terms of the 
generalisation, ‘All simple pendula have periods T = 2π√(l/g)’, or in which
the output restricting behaviour of some monopolistic firm is explained 
by reference to the generalisation, ‘All monopolistic firms are output 
restrictors’. To the extent that it makes sense to speak of the scientific 
explanation of such particular cases, the generalisations which would 
figure in such explanations are surely just those that would figure in the 
explanation of the corresponding general case. That is to say, a scientific 
explanation of why this A is B will involve just those generalisations which 
could be used to explain why all As are Bs. In this sense, the scientific 
explanation of particular facts is an activity which is derivative or parasitic 
on the scientific explanation of generalisations. The tendency to suppose 
otherwise plays, I think, no small role in making the picture of scientific 
explanation covering-law theorists have given us, seem plausible and 
attractive. Correspondingly, to the extent we see the explanation of 
particular facts as a derivative scientific activity we will, I think, be led to 
see the picture of scientific explanation covering-law theorists have given 
us as misleading in certain crucial respects. 


4 Consider one last example, which will help to draw together the 
disparate strands of my discussion. In a well-known passage in ‘Aspects 
of Scientific Explanation’, Hempel describes two different explanations,
which I have distinguished by means of brackets for later reference, of
why the water level of a beaker containing a piece of floating ice will 
remain unchanged as the ice melts. 


A purely logical point should be noted here, however. If an explanation is of the 
form (D-N), then the laws L₁, L₂, ..., Lᵣ invoked in its explanans logically imply
a law L* which by itself would suffice to explain the explanandum event by
reference to the particular conditions noted in the sentences C₁, C₂, ..., Cₖ.
This law L* is to the effect that whenever conditions of the kind described in
the sentences C₁, C₂ ... the explanandum-sentence occurs. Consider an
example: A chunk of ice floats in a large beaker of water at room temperature. 
Since the ice extends above the surface, one might expect the water level to rise 
as the ice melts; actually, it remains unchanged. Briefly, this can be explained 
as follows: [According to Archimedes’ principle, a solid body floating in a 
liquid displaces a volume of liquid that has the same weight as the water displaced 
by its submerged portion. Since the melting does not change the weight, the ice 
turns into a mass of water of the same weight, and hence also of the same 
volume, as the water initially displaced by its submerged portion; consequently, 
the water level remains unchanged. The laws on which this account is based 
include Archimedes’ principle, a law concerning the melting of ice at room 
temperature; the principle of the conservation of mass, and so on.] None of 
these laws mentions the particular glass of water or the particular piece of ice 
with which the explanation is concerned. Hence the laws imply not only that as 
this particular piece of ice melts in this particular glass, the water level remains 
unchanged, but rather the general statement L* that [under the same kind of 
circumstance, i.e., when any piece of ice floats in water in any glass at room
temperature, the same kind of phenomenon will occur, i.e., the water level will
remain unchanged... clearly, L* in conjunction with C₁, C₂, ..., Cₖ logically
implies E and could indeed be used to explain, in this context, the event described
by E.] ([1965], p. 347)


Whether ‘minimal covering-laws’ like L* will always be, in any natural
or interesting sense of the word, ‘laws’ is an interesting question but one
which will not detain us here. The question I want to consider here
is rather this: assuming that L* is a law, is it a law which could be used
in a scientific explanation of why (E) the water level in some particular
glass remained unchanged when the ice in it melted?

In the above passage from Hempel, I have introduced two sets of 
brackets. The first set encloses a sketch of what might reasonably or 
naturally be regarded as a scientific explanation of E. The second encloses
a minimal covering-law explanation of E. The contrast between these two 
explanations is, I think, quite striking and much of my discussion in this 
chapter can be regarded as an attempt to explicate this contrast. 

Note to begin with the contrast between L* and the scientific laws— 
Archimedes’ principle, the law of conservation of matter and so forth— 
which figure in the explanation enclosed in the first set of brackets. The
latter but not the former are stated in that abstract vocabulary which 
makes possible the characterisation of a range of different cases and their 
systematic inter-relation. The laws which figure in the explanation 
enclosed in the first set of brackets are such that they can be used in 
conjunction with different sets of initial conditions to explain a whole 
range or spectrum of different explananda—thus Archimedes’ principle 
can be used, for example, to explain why any solid body floating in a 
liquid displaces the volume of liquid it does. The explanation enclosed 
in the first set of brackets gives us some sense for the range of conditions 
under which the explanandum will hold. When we see the relevance of 
the considerations invoked in the explanans—that a solid which floats 
displaces a volume of liquid equal to its own weight, that no change of 
weight is involved when a solid melts—we see that the same reasoning 
could well be used to show that any solid floating in its liquid form and 
melting will leave the level of the liquid unchanged. Nothing in the 
explanation of E turns essentially on the fact that a piece of ice is involved 
in the above example. We can also see, once we appreciate the relevance
of Archimedes’ principle to this case, that it does make a difference 
whether the solid is floating in the liquid. If, for example, the solid sinks 
in the liquid, then it will displace a volume of liquid equal to its own 
volume rather than its weight, and since a given mass of solid will generally 
increase in volume when it melts, the level of liquid in the container will 
rise as the submerged solid melts. 
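
The ‘what if’ reasoning above can be checked with a short calculation. The following is only a sketch under simplifying assumptions that go beyond the text: the solid melts into the very liquid it sits in, densities are uniform, and the function name and the numerical densities are illustrative. A floating solid displaces its own weight of liquid, a sunken one displaces its own volume, and the change in level is fixed by the difference between the volume of melt produced and the volume previously displaced.

```python
def level_change_volume(mass, rho_solid, rho_liquid):
    """Volume added at the liquid surface once the solid has fully melted.

    Assumes the solid melts into the surrounding liquid (density rho_liquid).
    A positive result means the level rises; zero means it is unchanged.
    """
    floats = rho_solid < rho_liquid
    if floats:
        # Archimedes' principle: the displaced liquid weighs as much as the solid.
        displaced_before = mass / rho_liquid
    else:
        # A sunken solid displaces exactly its own volume.
        displaced_before = mass / rho_solid
    melt_volume = mass / rho_liquid  # conservation of mass through melting
    return melt_volume - displaced_before


# Illustrative figures only (kg and kg per cubic metre).
print(level_change_volume(0.1, 917.0, 1000.0))   # floating ice in water
print(level_change_volume(0.1, 1200.0, 1000.0))  # hypothetical solid denser than its melt
```

Nothing in the function mentions ice, water or glasses; only the mass and the two densities enter, which is the sense in which the bracketed explanation identifies the factors relevant to E.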

By contrast, the minimal covering-law explanation in terms of L* gives
us no information of this kind. It can be regarded as assuring us, perhaps,
that the water level in the glass ‘had’ to remain unchanged when the ice
melted, but it does not answer the ‘what if...’ questions that the explanation
enclosed in the first set of brackets can be used to answer. And to
say this is to say that the minimal covering-law explanation does not 
perspicuously exhibit the factors or conditions which are relevant to the 
water level in the glass remaining unchanged in the way that the explana- 
tion enclosed in the first set of brackets does. One can be aware of the 
information contained in the minimal covering-law explanation, and yet 
largely fail to understand what it is about an ice cube melting in a glass 
that results in the water level in the glass remaining unchanged. One 
may be aware of L* and yet still be unclear whether the fact that the 
water level in the glass remained unchanged has to do with some feature 
peculiar to water and ice, whether it has to do with the fact that a glass 
rather than some other container was employed, and so forth. When we 
ask for a scientific explanation of E we are, I think, interested in the 
answers to questions like these, not merely in a certification that E had to
be true. 

We may note also the important role that a recharacterisation or re- 
construal of the explanandum plays in the explanation enclosed in the 
first set of brackets and the complete absence of this feature in the minimal 
covering-law explanation. In the first explanation we begin to understand 
why the water-level of the glass remained unchanged as the ice-cube 
melted in it when we come to see the ice cube as simply one instance of 
a solid floating in a liquid (and hence, according to Archimedes’s principle,
a solid which displaces a volume of liquid equal to its own weight), the 
melting of the cube as a process, which like any other change of state, 
results in no weight change, and so forth. The air of triviality which, by 
contrast, surrounds the minimal covering-law explanation, reflects its 
failure to provide any such reconstrual of E, any such exhibition of E in
a new light. 

These dissimilarities between the two explanations considered above
seem to me to warrant my claim that the difference between them may
profitably be regarded as a difference in kind rather than merely a difference
in degree. The explanation enclosed in the first set of brackets
answers a kind of question which is not answered by the explanation in 
the second set of brackets. The first explanation does something more 
than merely exhibit a nomologically sufficient condition for E, and this 
something more makes a crucial difference to its character as an explana- 
tion. It is this difference I have tried to capture in characterising the first 
explanation as scientific and the second explanation as non-scientific. 
Cause and spacetime




INTRODUCTION 


In the ‘Conclusion’ of his interesting book The Existence of Space and 
Time ([1975]), Ian Hinckfuss argues for a very general thesis that ‘the 
relationist programme [for space and time] can always be made to work’. 
I do not think that his argument proves the thesis and, in trying to say 
why, I wish to develop the idea of a geometric style of explanation. This 
is the idea of a kind of explanation which appeals to the geometric proper- 
ties of space itself, which requires an ontic commitment to space and does 
not reduce to a causal explanation in terms of material objects and relations 
among them. 

As you might expect, Hinckfuss offers a complex dilemma for his 
sweeping conclusion. I will not repeat the whole of his argument here. I 
do not think that presenting just its backbone prevents us from seeing 
that it is persuasive and powerful, visibly incomplete though it is. The 
parts of the argument which I wish to challenge are italicised. 

It begins with the convention that ‘Ψ’ is to abbreviate a complex
dummy description ‘d₁ & d₂ & ... & dₙ’ of spatial or temporal properties
not yet reduced to relational terms. Assume, further

(2) Space, and only space, is Ψ
which, for present purposes, we take to be contingent.

Now space being Ψ is either causally efficacious with respect to some events
involving matter or it is not. If it is not, then there is no way in which we can find
out that space is Ψ, so we will not know that space is Ψ and we will have no reason
to believe that there is something, namely space, which is Ψ. In which case an
ontological reduction of space is as much in order as is an ontological reduction
of phlogiston. Now let us consider the case when space being Ψ is causally
efficacious. Ψ may or may not be a property which varies either in strength or
direction or both from one place to another and from time to time. ...

If space were ubiquitously Ψ then we could never set up an experiment in
which space was not Ψ. That is, any evidence we would ever have that distributions
of matter of type E, together with space being Ψ, always cause distributions
of matter of type E′, would also be evidence for the simpler statement:

Distributions of matter of type E always cause distributions of matter of type
E′.

Thus in this case, space being Ψ might well be causally efficacious, but we
would have no reason to suspect that that was so....

Now let us assume that Ψ is a quantity which varies in value and/or direction
from place to place ... Experiments wherein the type of distribution of matter
was kept constant while the value of Ψ was varied would be possible, and such
experiments would yield evidence for the causal relevance of the variable
quantity Ψ of space.

But, in reply to this the relationalist could always say that space is, contrary
to the considerations, causally inefficacious, or as he would prefer to put it, that
causal laws are spatially and temporally uniform, and that what the above
considerations show, therefore, is that some material pervades the whole of space,
and it is this material (aether or field) which bears the property of having the
quantity Ψ, which varies from one part of this all-pervading substance to
another. (pp. 141-2.)


To show beyond doubt that the geometric style of explanation, as I
described it, is distinct from the causal style, we would need a determinate 
idea as to what causal explanation is. Such ideas as we have strike me as 
vague. This means that we might use the phrase ‘causal explanation’ 
broadly enough so that any explanation of how a thing has an observable 
property, or how it undergoes an observable change counts as causal. It 
may be that Hinckfuss was using the phrase in this broad sense in concluding
that unless the Ψ-ness of space causally influences matter we
could never find out whether or not space is Ψ. A generously broad sense
of causal explanation would immediately yield Hinckfuss the conclusion
that unless the Ψ-ness of space causally influences matter (or, at least,
the electromagnetic field) we could never find out whether or not space is
Ψ. I do not know how to give a crisp sense to the ideas of causal explanation
and causal efficacy and I will not pretend to do so. But I intend a
more robust sense of causal explanation and efficacy than this when I 
say that there is a style of explanation which is geometrical but not causal. 
I take cause to be, fundamentally, a relation among events, causal efficacy 
to require one event to make another happen or, at least, one thing’s 
having a disposition to change another which is manifested under certain 
conditions (elasticity is manifested in collisions, for example). Though 
this is rough, at best, I think it serves to give point to a claim that the 
generously broad sense of ‘causal efficacy’ is a deviant one. 


What Can Geometry Explain?


In most of what follows, causal explanations fall within Newtonian 
mechanics and the idea of causal efficacy is based on the notions force, 
mass, momentum and energy. So something is causally efficacious only 
if it exerts a force on something, if it changes something’s momentum or 
exchanges energy with it, or if it is involved in action-reaction pairs. In 
that sense, I think I can show that we could indeed find out that space is 
Ψ without space (or the Ψ-ness of space) being causally efficacious. But
if the broad sense of ‘causal efficacy’ is intended by Hinckfuss, then I 
think his thesis is trivialised (as one might expect) in the way described 
in section 4. 

But I now turn, for a moment, to an example which has nothing to do 
with the concepts of classical physics. 


1 HANDS AND HANDEDNESS


My first example is of the familiar property of left-right differences. 
Since we take our own hands to differ in this way, it is a geometrical
property of very domestic things indeed. Since I have already discussed
the problem of handedness in two places,¹ I will be brief with it now.

The handedness of an n-dimensionally asymmetrical object in an 
n-dimensional space depends on a global topological feature of the space 
called orientability. This is best grasped by seeing it as an aspect of the 
shape of the space. Two dimensional examples illustrate the general idea. 
An L-shape is two dimensionally asymmetric if we embed it in the
2-spaces of the Euclidean plane, the cylinder and so forth. But it is not
handed if we embed it in the space of the Möbius strip or of Klein’s
bottle. In these spaces an L-shape carried rigidly round the space comes 
back locally congruent with its former reflection. It is homomorphic, 
not enantiomorphic or handed. 

What is being explained here and how does the explanation work? 
Ordinarily, I suppose, we take the difference between left and right to be 
a primitive, simple one which we can gather directly by inspection and 
which is to be defined ostensively. The first part of the explanation 
points out that this is a mistake. Asymmetrical objects may be enantio- 
morphic or they may be homomorphic depending on the rigid motions 
(paths) which the space provides for them. Up to this point, we are 
explaining the concept of enantiomorphy by analysing it in terms of 
asymmetry and orientability, which is a property of space globally. It is, 
therefore, not so far a causal explanation. Whether or not space has this 
global property is not a question of any change which the space enforces 


1 My [1973], 337-51; and more fully in my [1976], ch. 2. 
2 See, e.g., Earman [1971] and Bennett [1970].
upon objects in it, nor does it sustain the handedness of objects by some
action it performs upon them or some disposition it has to make things 
handed (or not). It is a geometrical question about which pathwise 
connections are in the space. Space closes off no paths to objects. If it is 
like anything outside geometry, orientability is like an existential condition
on space; simply there are not the paths in such a space which permit 
asymmetrical objects to be homomorphic. But whatever one may make of 
that way of classifying what the explanation is telling us, I take it to be 
plain enough that it is telling us nothing causal. 

Another question might suggest itself at this point: can not this ex- 
planation be reduced to something relationist after all? This was the 
theme of a rejoinder by Laurence Sklar to the first version of my paper 
about hands. But Sklar’s method of dealing with the problem of enantiomorphy
is itself both absolutist and circular, or so I argued in reply.¹ It
is circular because when Sklar talks of possible motions, the sense of 
‘possible’ intended has to mean ‘permitted by the pathwise structure of 
orientable (or non-orientable) space’. It is absolutist because motions 
are in fact paths, i.e. parts of space, and of the same ontological order as it.
But I will not pursue this argument again here. 

I conclude this section by claiming, contrary to Hinckfuss, that we 
might well discover whether space is orientable or not (if space is finite) 
even though its orientability (or lack of it) does not cause things to be 
enantiomorphic or homomorphic. Furthermore, orientability evades the 
further reductive lemmas in his argument since it is a global property of 
space, not a pointwise definable one of the kind his later argument en- 
visages. It does not vary from point to point nor does it stay the same. 


2 CONSTANT CURVATURE AND WHAT IT EXPLAINS 


What happens when we apply classical physics to motion in spaces which 
have non-Euclidean geometry?² Any such space has a pointwise definable
property Ψ of curvature. I will be arguing in this section that, when Ψ
is non-zero and constant throughout the space, this makes a difference 
to the behaviour of matter. Though classical mechanics predicts the 
difference in the state of things, it is plainly not the case that the constant 
curvature of the space causes the characteristic states of matter. I take it
that, in the context of classical mechanics, we can speak of causes only 
when we can speak of forces. I hope to make it quite clear, intuitively, 
that there may well be a causal story to be told in these circumstances 
1 Op. cit., chapter 2, section 8. 


2 I am indebted in this and the next section to Professor Angus Hurst, Math. Physics
Department, University of Adelaide. 
about the behaviour of matter, but that it nowhere causally involves space
or its curvature. However the curvature of space is certainly needed for a 
wider style of geometrical explanation as to why things happen as they do. 

The two dimensional spherical surface is a space of constant positive 
curvature. Further, it is intuitively rather obvious to us what its geometry 
is, so I will begin by looking for some analogies there. The straightest 
paths that lie in the surface of a sphere are its great circles. If we apply 
the first law of motion to material points moving on a spherical surface, 
then it tells us that the force-free motion of a particle will be at uniform 
speed along a geodesic, which is one of the great circles. The figure shows 
us the spherical surface and a geodesic E. At various points on the geodesic 
are vectors (arrows) in the surface of the sphere each of which is ortho- 
gonal to E. Clearly, if these vectors are moved along their own lengths 
they will follow geodesics all of which meet at the point P which is polar to 
E. Thus, there are no parallel geodesics in the two dimensional surface of 
the sphere. 

Now consider a cloud of unconnected particles moving at the same 
speed in a common direction in Euclidean space. Each material point 
continues to move parallel to every other at the same speed and the cloud 
will maintain its shape indefinitely. The velocity of the cloud can there- 
fore be represented by a single vector through its centre. Its momentum, 
equally simply, is given by the same vector, up to a factor of mass. But 
it is seldom so simple in non-Euclidean spaces and clearly more complex 
in the standard simplest cases of constant curvature. The two dimensional 
case of constant positive curvature, the sphere, makes this obvious and 
similar results extend to negative curvature and to three dimensions. 
There are no parallels on the sphere. So, as is again clear from the figure, 
a patch of unconnected particles moving at the same speed through the 
2-space of the spherical surface can move in the same direction only in 
a weak sense. That is, at some particular time, t, the vectors of all the 
particles are orthogonal to members of some family of geodesics which 
meet only in the same pair of polar points. E and E′, in the figures, are 
members of such a family; particles in the patch whose velocity vectors 
at t are orthogonal to E or E′ converge on P or P′ respectively at some 
later time t′. 

The first law of motion entails that these particles will follow the 
geodesics in which their velocity vectors lie. As we saw these all come 
together. Hence the patch of particles will change its shape with time. 
Clearly, we can no longer deal adequately with the mechanics of the 
cloud of particles by means of a single vector through its centre. We must 
treat it as a vector field. So momentum, too, is a vector field, identical 
with the velocity field up to factors of mass. These fields will be much less 
simply summed than is possible in Euclidean space. Generalising again 
to higher dimensions and negative curvature, it follows that a cloud of 
dust inertially moving in non-Euclidean 3-space of constant non-zero 
curvature will change shape and volume over time. 
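
The convergence just described can be made quantitative with the standard equation of geodesic deviation (the Jacobi equation). What follows is only a sketch and is not part of the original argument; the symbols K (curvature of the surface), R (radius of the sphere), s (distance travelled along the geodesics) and δ(s) (separation of two neighbouring particles, taken to first order) are introduced here purely for illustration.

```latex
% Geodesic deviation on a surface of constant curvature K.
% \delta(s): separation of two neighbouring geodesics that leave a common
% geodesic E orthogonally, with initial separation \delta_0.
\[
  \frac{d^{2}\delta}{ds^{2}} + K\,\delta = 0,
  \qquad \delta(0) = \delta_0, \quad \delta'(0) = 0 .
\]
% Sphere of radius R (K = 1/R^2): the paths converge and meet at the pole P.
\[
  \delta(s) = \delta_0 \cos\!\left(\frac{s}{R}\right),
  \qquad \delta = 0 \ \text{at}\ s = \frac{\pi R}{2} .
\]
% Constant negative curvature (K = -1/R^2): the paths spread apart.
\[
  \delta(s) = \delta_0 \cosh\!\left(\frac{s}{R}\right) .
\]
% Euclidean case (K = 0): \delta(s) = \delta_0, so the cloud keeps its shape.
```

On this sketch the change of shape of the inertially moving cloud depends only on K and on the distance travelled; no force term appears anywhere in the equation.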


Figure 1 (panels A and B). Vectors orthogonal to geodesics which meet in the same pair of polar points. 


Here, then, is one example of an observably changing state of matter 
which involves no causes at all, since the motion involved is purely 
inertial, governed just by the first law. No forces operate at all. The 
curvature of space explains the change of shape in the context of classical 
physics but, quite clearly, is nowhere causally involved in it. 

Analogous results hold for the motion of dust clouds in the majority 
of 3-spaces with variable curvature and I draw from them the same 
conclusions about the geometrical, non-causal, style of explanation which 
spatial curvature gives. 

The general problem for a swarm of interdependent particles moving 
in spaces of non-zero constant curvature can be understood just by 
looking at the motion, in two dimensions, of an elastic membrane moving 
through a spherical surface. Classical physics again entails that each point 
in the membrane will move along a geodesic unless acted upon by a force. 
But now the elastic forces in the membrane will act on its individual 
molecules so as to resist change in the membrane’s shape. Elastic forces 
are electromagnetic and hold among the molecules of the membrane, so 
that stress in the membrane may be regarded as a vector field. Clearly 
this vector field interacts with the vector field of momentum, so that the 
outer molecules have their momentum changed just by stress in the 
membrane which will be in a constant state of tension as it moves. The 
stress energy is acquired as the object is accelerated, but since the curvature 
of the space is constant the total field of an inertially moving object need 
not change with time, so that the body as a whole will satisfy the principle 
that momentum is conserved. The mechanics of the motion is rather 
analogous to the familiar case of an elastic solid rotating in Euclidean 
space which conserves angular momentum under stress. 

Here again, the results extend in quite analogous ways, to spaces of 
three dimensions and constant negative curvature. In these dimensions, 
stress is a tensor field and this adds some complexities to calculations 
which need not detain us. Quite generally, the mechanics of the inertial 
motion of elastic areas or volumes in spaces which have no parallels¹ must 
be quite different from the mechanics of their inertial motion in Euclidean 
space. But even in the simplest cases of constant non-zero curvature, an 
inertial uniformly moving body is under stress. Moreover, its stress is 
clearly a function of its speed and of the curvature tensor Ψ. Thus its 
motion will be absolutely detectable from its internal mechanical state, 
just as for classical physics rotation is detectable in Euclidean space. So 
the inertial motion of elastic volumes is not relative in such spaces; hence 
motion in general cannot be. 

It is clear, I think, that it would be quite wrong to claim that space 
plays a causal role here in stressing the body. The causal story is ex- 
hausted in our account of how the vector or tensor field of stress interacts 
with the momentum field. It tells of electromagnetic forces acting among 
molecules so that some are accelerated and their momentum field is 
changed. That assigns a clear role to the first and second laws of motion. 
The third law is clearly fulfilled in action—reaction pairs among molecules. 
Space no more enters the picture causally than it does when classical 
physics explains the mechanics of a rotating elastic solid. However, that 
space has constant non-zero curvature plays a crucial geometric role in 
explaining why and how much the thing is stressed in its motion. Thus 
it would also follow that the constant zero curvature of Euclidean space 
plays a role, neither trivial nor causal, in explaining how bodies move in 
it without stress. 


3 VARIABLE CURVATURE AND WHAT IT EXPLAINS 


My next example of geometric explanation makes reference to a variable 
pointwise definable property of space. I want to look at explanations 
which use the curvature of space and I assume that this feature will 
differ from point to point. I will look at how classical physics can be 
applied to the motion of an elastic solid through a space of variable 
curvature. Again, I hope to show that though causes play a part in ex- 
plaining the oddities of motion in these circumstances, the complete 
explanation contains a part which is clearly not causal but recognisably 
geometrical. I will use the concepts of momentum, energy and elastic 
(electromagnetic) force to show as before that the relevant causal inter- 
actions are all among material particles. Space plays no role in them other 
than to define the directions and distances which they act across. 

1 Spaces that are non-Euclidean may have parallels. For example, the 2-space of the torus 
has parallels but is variably curved. 

There’s a proverb which says that square pegs won’t fit in round holes. 
If the peg were made of soft rubber it might be made to fit the hole if we 
were to exert a force on it, change its shape and induce some stresses in 
the elastic material which it is made of. We will not need a distinctively 
geometrical style of explanation to show us what is going on here, since 
we will not need to make use of any properties of space itself. But suppose 
our peg is a Euclidean cube of soft rubber and our hole an empty region 
of non-Euclidean space. We will just as surely have to change the shape 
of the peg if we are to move it into the hole which means that we must 
exert a force on the cube and stress it, as before. But here we cannot exert 
a force against the constraining walls of the hole, since it has no walls. 
The force is spent wholly in changing the peg; but this is something we 
need to look at in some detail. Fortunately, we can keep the details qualita- 
tive and largely intuitive. 

Let us begin by looking for some analogies in two dimensions. Suppose 
we have a square piece cut from a flat elastic membrane, that is, one which 
will lie flush on a Euclidean surface without internal stresses. Stresses 
are elastic forces between parts of the material which a thing is made of 
and are basically electromagnetic forces. This square piece cannot lie 
flush with a spherical surface unless the area of the sphere is very much 
greater than the membrane’s area. On the sphere there are no four distinct 
geodesics which bound an area and which intersect orthogonally at all 
four corners. If the piece is to fit flush it must somehow be stretched out 
in some parts or squeezed up in others. We must change the ratio between 
its perimeter and its area since no part of the area of the sphere has such 
a perimeter/area ratio as the piece of membrane takes up in its stress free 
state. 
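
The claim that no region of the sphere is bounded by four geodesics meeting at right angles can be checked against the Gauss-Bonnet theorem. The lines below are a sketch added for illustration only; the symbols K (curvature), R (radius of the sphere), A (enclosed area) and α₁ to α₄ (interior angles) are assumptions of this sketch and do not appear in the original text.

```latex
% Geodesic quadrilateral of area A on a surface of constant curvature K,
% with interior angles \alpha_1, ..., \alpha_4 (Gauss-Bonnet theorem):
\[
  \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 2\pi + K A .
\]
% Four right angles sum to exactly 2\pi, which forces K A = 0.
% On a sphere of radius R, K = 1/R^2 > 0, so no region with A > 0 qualifies:
\[
  4 \cdot \frac{\pi}{2} = 2\pi + \frac{A}{R^{2}}
  \;\Longrightarrow\; A = 0 .
\]
```

The excess of the angle sum over 2π grows with A/R², which is why a very small square on a very large sphere can be made to lie nearly flush with only slight stress.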

Clearly, in the light of these facts, if we move a square piece of membrane 
across a surface of variable curvature on which we make the membrane 
lie flush, then it cannot move freely. Let us suppose that frictional forces 
can be neglected. In that case, what impedes the free sliding of the mem- 
brane across the surface can only be that its perimeter/area ratio must be 
readjusted to match the differing curvatures in regions of the surface 
itself. We must use force on the membrane to stretch or shrink it before it 
can fit flush against new parts of the surface. This provides a simple an- 
alogy of what we will find in three dimensional cases of variable curvature. 

As before, the examples are explained in detail by considering how the 
momentum field changes and is changed by the tensor field of stress. 
Spaces of variable curvature are too diverse to allow a general intuitive 
argument directly about vector and tensor fields which would give a 
useful result. But, as the case of the flat membrane on the sphere shows, 
once the geometry of a space changes from region to region so the dis- 
position of material within an elastic solid must change, as it moves 
through the space. Thus a volume of soft rubber which is in a stress free 
state in a Euclidean region cannot be in that shape if moved into a region 
of non-zero curvature. Just as there is no such shape as a square in the 
space of the spherical surface, so there is no such shape as a cube in these 
non-Euclidean spaces. Therefore if we move the volume of soft rubber 
through a space of variable curvature from a Euclidean to a non-Euclidean 
region it must be stressed out of its Euclidean shape in order even to 
enter or occupy part of the latter regions at all. 

These new aspects of the situation make the mechanics of the motion 
of an elastic solid through a space of varying curvature even more complex 
than in spaces of constant non-zero curvature. Before, it was clear that an 
object acquires stress when it is accelerated and maintains its stress in 
uniform linear motion. But in those cases, the tensor of stress was not 
required to change again once the object moved inertially. Now, however, 
the tensor of stress clearly will have to change with time as the solid 
moves through regions which differ in their geometry. 

In this case, a simple, intuitive but quite general argument from the 
principle that energy is conserved tells us the kind of thing that must 
happen. Envisage the cube of soft rubber moving uniformly through a 
Euclidean region in a stress-free state. It approaches a region of non-zero 
curvature and moves into it. What will happen? As we saw, the stress 
tensor must change and the cube will acquire some energy of stress. The 
principle of conservation of energy in classical physics is a fundamental 
one. It requires that the energy of stress is gained at the expense of energy 
in some other form. The only candidate available in the case of inertial 
motion is the kinetic energy of the moving block. But it can change its 
kinetic energy only by changing its velocity and momentum. So the cube 
will slow up, veer away from a geodesical path, begin to rotate or in some 
such way behave like a body acted upon by a force. However, it will not, 
in fact, be subject to any external force; the only forces at work are the 
internal elastic ones that bind its parts together. 

In spaces of variable curvature, we may assume that the three classical 
laws of motion are true of independent particles moving as clouds of dust. 
That momentum is conserved for such particles is still a theorem. But 
it is not also a theorem in spaces of variable curvature for connected 
volumes of matter nor, of course, for the individual points of such a 
volume. The interaction of the tensor field of stress with the vector field 
of momentum is so complex that we can say very little in general about 
how they will sum over a volume. Furthermore in certain cases, kinetic 
energy may be gained at the expense of stress energy and the elastic 
volume will accelerate without the intervention of an outside force. 
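
The energy argument of the last two paragraphs can be set out schematically. This is a simplified sketch under stated assumptions, not the author's own formulation: T stands for the total kinetic energy of the body's momentum field, U for its elastic (stress) energy, ρ for mass density and v for the velocity field, all of them symbols introduced here for illustration, and the body is taken to be isolated from external forces.

```latex
% Schematic energy bookkeeping for an isolated elastic body.
% T: total kinetic energy of the momentum field; U: elastic (stress) energy.
\[
  T = \int_{\text{body}} \tfrac{1}{2}\,\rho\,|\mathbf{v}|^{2}\, dV ,
  \qquad T + U = \text{constant} .
\]
% Entering a region whose curvature forces a change of shape:
\[
  \Delta U > 0 \;\Longrightarrow\; \Delta T = -\Delta U < 0 .
\]
% Conversely, where stress energy is released, \Delta T = -\Delta U > 0,
% and the body gains kinetic energy without any outside force.
```

Nothing in this bookkeeping assigns energy to space itself; both terms belong to the matter of the body.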

This suggests that the third law, that action and reaction are equal and 
opposite, is broken. This looks plausible only so long as we overlook the 
inadequacies of treating the mechanics of a moving elastic body by means 
of a single vector of momentum through the centre of mass. But clearly, 
the solid must change its shape and the matter that fills its volume be 
rearranged in order to occupy a region of space with a new geometry. 
This means that molecules must change their distances and orientation 
from one another as the solid moves. But then the intermolecular electro- 
magnetic forces come into play, acting as the second law describes, 
making up a set of action-reaction pairs among the molecules as the 
third law requires. These intermolecular forces change the momenta of 
molecular points in the vector field of momentum. Just how the changes 
sum is complex, but the argument from the conservation of energy shows 
that the nett effect, in general, is to change the momentum of the solid. 
Thus the third law is met at the micro-level in terms of action and reaction 
among the material points of the volume. 

Clearly, space does not enter as a participant into these mechanical 
interchanges. The curvature of space does not and cannot exert pressure 
on the solid, as a round hole might exert pressure on a square peg thrust 
into it. Space absorbs no energy, exerts no force, enters no reaction. It 
plays no causal role whatever, though it very clearly plays an explanatory 
one. The non-causal part of the explanation of how the stress free Eucli- 
dean cube changes its linear motion and acquires stress is simply that the 
space is not there in regions of different curvature for the matter to be 
disposed in a free state. That is a geometrical style of explanation, making 
reference to the shapes of the space in different regions. 


4 IS CURVATURE REDUCIBLE? 


So far, I have directed the argument against the claim that we can discover 
that space has a property only if it is ‘causally efficacious with respect to 
some events involving matter’. It is this claim which makes the later 
reductive phase of Hinckfuss’s argument look plausible, for if P is a 
causal property of space then it certainly seems both possible and desirable 
to ascribe it to a material plenum rather than to space. In the case of 
handedness, however, the property Ψ of space which explains how hands 
are as they are is neither causal nor local. In the later examples, the 
property Ψ is not causal but at least it is local and may or may not vary from 
region to region. Is it open to us, in these cases, to ascribe the property Ψ 
to a material field or ether instead of to space? Hinckfuss, be it noted, 
does not explicitly claim that it is open to us, but it is worth raising the 
question none the less. 

The quantity, Ψ, in our later example is the tensor of space curvature. 
I am not sure what it would mean to ascribe the curvature tensor to 
matter or the field, but it could hardly amount to more than saying that 
whatever has the quantity Ψ is spatially disposed in the appropriate 
geometrical way. Thus that an ether sea has Ψ in the way required could 
only be parasitic on there being a space of Ψ curvature in which the 
material ether could be thus disposed. If that is so, then it remains obscure 
to me, at least, how it is the material stuff and not space which has this 
quantity and how the geometry of its arrangement is a material property 
of the stuff. On the contrary, we shall certainly be obliged to deny all 
causal powers whatever to the material stuff if the reduced explanation is 
not to be more powerful than the geometric one and take us beyond the 
observed facts. But then, what is meant by describing as material (rather 
than, say, as spiritual, nugatory or null) an omnipresent, eternal, un- 
changing and unchanged somewhat which pervades the whole of space? 
What can be meant by regarding it as causally efficacious if it is strictly 
required to add nothing to the explanation but only to fill up the ap- 
propriately curved space? 

What we can do, perhaps, is to substitute the word ‘ether’ for the 
word ‘space’ throughout our earlier descriptions and legislate that the 
result is true. But it would by no means follow that we would then have 
a deeper explanation, that we would have really dispensed with space or 
with the geometrical explanation or that anything whatever would have 
got the least bit clearer in the process. We might well succeed in puzzling 
ourselves considerably over what the idea of the material in the description 
can really be and we would have run a serious risk of making it quite 
vacuous. The result overall would hardly seem to merit our regarding 
it as any kind of reduction. 

Hinckfuss claims (pp. 141-2) that, even where space’s being Ψ is 
‘causally efficacious’ then, if Ψ is constant, ‘we would have no reason to 
suspect that that was so’. Any evidence for that fact would be evidence 
for the material causal statement: 


Distributions of matter type E always cause distributions of matter type E′. 


But what can it mean to speak of distributions of matter if not its distribu- 
tions in space? What is a material thing itself if not a spatial object? Where 
Ψ is the affine property of curvature then, clearly, we cannot specify the 
E and E′ type distributions of matter in a way which yields the material 
causal law without thereby stating rather directly that the regions in 
which it is distributed have constant curvature Ψ. This is true even if we 
specify only the positions of particles (up to an affine transformation). 
But this alone will still not give us the material causal law as a truth 
without further specifying each velocity vector (up to an affine trans- 
formation). If the two types of distribution are always to be correlated 
this, in turn, entails directly that the curvature is a global constant. I 
suspect that a similar result must hold for any pointwise definable, 
geometric property P, whether this be curvature, the metric tensor or 
even some merely projective or topological concept. Which properties, 
other than such geometric ones, are plausibly at issue in the argument? 
In fact, the constant curvature of space (or whatever Ψ may be) could 
escape our notice only through a kind of wilful blindness to the plain 
import of the material causal law, since we would have the most impeccable, 
even irresistible reasons for perceiving the import. The only proviso to 
this is that the curvature might be so slight as to elude detection by our 
most sensitive apparatus of observation. But then the material causal law 
would elude detection, too. 


5 CAUSE AND SPACETIME 


Do the preceding examples illustrate a non-causal, geometrical style of 
explanation which also holds sway in General Relativity (GR)? That 
remains a complex question even if it is true that we have satisfactorily 
isolated and illustrated geometric explanation. Some brief remarks on 
this difficult matter will end this paper. 

On the face of it, GR provides a very strong example of geometric 
explanation since not only is spacetime curvature the fundamental ex- 
planatory concept of the theory, but the idea of spacetime geometry is 
actually used to reduce causal explanation by gravitational force in space 
during time. If spacetime is flat (i.e. Minkowskian or pseudo-Euclidean) 
then a geodesic or linear path in spacetime projects onto a motion, uniform 
in time, along a geodesic or linear path in space. That is the case in Special 
Relativity (SR). So in SR Newton’s first law applies both in space and 
spacetime: the path of a force-free body is linear in space and uniform in 
time and its spacetime world line (or trajectory, for short) is linear, too. 
But in GR, where we suppose that spacetime is not merely curved, but 
variably curved, it is no longer the case that a geodesic of spacetime can 
always be projected down into a geodesic of space. So while the trajectory 
of a force-free thing will be a geodesic of spacetime (and linear there) 
it may yet yield an almost arbitrary curve, including even a closed curve, 
when we project it into space; further, motion along it need not be uni- 
form. That is true of the spatial path of a planet: it is a closed curve in 
space, but an open geodesic of spacetime. The body moves along its 
spatial path, not uniformly, but according roughly to Kepler’s law of 
equal times. Given just that space and time picture of its motion we would 
have to regard it there as moving under a gravitational force. But we can 
still apply Newton’s first law to it in spacetime, seeing its trajectory there 
as that of a force free body, thus reducing gravitation to spacetime curva- 
ture—cause to geometry. But we can do this only by giving up the space 
and time language of enduring continuants. The force reduction requires 
a new ontology of us. 

If it were all as simple as that, then we might conclude at once that 
explanation in GR is a novel, powerful and more advanced kind of ex- 
planation of the style defined and illustrated in the earlier sections. But 
it is not so simple. The distribution of matter, people say, affects spacetime 
curvature and that sounds causal: the structure of spacetime is caused 
by the distribution of matter. But, as we saw in the last section, matter 
can only be distributed as the structure of space or spacetime permits. 
We cannot say either of the distribution or of the structure that the one has 
a causal or more widely explanatory priority over the other. Certainly 
matter distribution and spacetime curvature constrain one another, but 
it is not yet clear that we can say more than this. 

The correct way to view GR is as a field theory and this might tempt us 
to claim that the individual source terms for the gravitational or curvature 
field are the individual mass-energies of bits of matter and not vice-versa. 
This is largely true, but there are some subtleties in it. The gravitational 
field may be well defined for empty spacetime not just in the case of flat 
Minkowski spacetime but for much less trivial structures. Nevertheless, 
quite certainly, one source term of curvature is the mass-energy of the 
body. However, this is not a characteristic constant of the body which 
measures something like the quantity of matter in it, as Newton thought 
it did. It is a function of its inner stress, too, and this is itself affected by 
the gravitational influence of the mass-energy distribution round it. GR 
is not a linear theory in which the contribution of various material objects 
to the total field can be simply summed.¹ 

Nevertheless this does seem to leave open the possibility of causal 
intervention by us to change the gravitational field. I compress a body. 
This changes its mass energy however minutely and thus alters its contribu- 
tion to the gravitational field and, it would seem, to spacetime structure. 
So I can act upon geometrical structure even if I produce an effect rather 
indirectly. 

This is not quite straightforward, however. The reason why is a counter- 
part of the reasons why spacetime curvature is reductive of explanation 
in terms of gravitational cause. The ontology is wrong. If spacetime 
structure were affected by an action then it would be changed by it. 
This ought to mean that, at one time, the curvature of a spacetime region 
has one value and, at another time, another value. But that is strictly 
unintelligible unless we erect a further dimension of time within which 
spacetime could undergo such a change. I imagine that philosophers are 
likely to differ widely as to what they would take the significance of that 
consideration to be. 

I will not try to settle the matter. Clearly, the situation in GR is much 
more subtle than it is in the cases discussed earlier where we seem to 
have very clear reasons for saying that geometric structure may explain 
the behaviour of matter without in any way causing it. No doubt GR 
has changed and is changing our understanding of both the material 
and the spatial. I believe we can see better how this goes on if we look at 
GR side by side with simpler theories. The theories described in the 
bulk of this paper are simpler than GR in two ways: first, no basic law 
connects the structure of space with the density, flux etc. of mass energy; 
second, space is not considered as a projection from spacetime. What I 
think the comparison reveals is that giving space a role in physical ex- 
planation need not, by itself, take us any nearer to showing that space 
may be understood as material when we treat it as real. It is always open 
to us to say that spacetime is a material field. Of course, the field can be 
regarded as material only in a somewhat attenuated sense and there can 
be little doubt that field theories have changed our concepts of the material 
and the physical. Hence, it is by no means clear that to describe spacetime 
as a material field accomplishes a material understanding of space and 
spacetime rather than a geometrical extension of the concept of matter. 


University of Adelaide 


1 See Graves [1971], section 13 esp. p. 227. 


Reviews 


MACKENZIE, Brian D. [1977]: Behaviourism and the Limits of Scientific Method. 
London: Routledge and Kegan Paul (International Library of Philosophy 
and Scientific Method). £4.95. Pp. 193. 


According to the “standard” account, behaviourism was born in 1913 as a result 
of John B. Watson’s polemical paper “Psychology as the Behaviorist Views It”. 
The opening lines are now notorious: 


Psychology as the behaviorist views it is a purely objective experimental 
branch of natural science. Its theoretical goal is the prediction and control 
of behavior. Introspection forms no essential part of its method. ... 


The movement was mortally wounded some half-century later by Chomsky’s 
review of B. F. Skinner’s work on verbal behaviour. But even strong critics 
acknowledge that the movement has shaped the direction taken by much 
contemporary psychology, and it has left both a negative and a positive legacy. 

“Standard” accounts are sometimes superficial, and one of the merits of the 
book by Brian D. Mackenzie is that it attempts to place the heady fifty years 
from Watson through Skinner into the broader perspective of post-Darwinian 
psychology on the one hand, and late nineteenth and twentieth century 
philosophy (particularly philosophy of science) on the other. 

Issues long-neglected are forcibly brought to the reader’s attention: just what 
was wrong with the introspective methods that Watson rejected; are the 
methodological needs of comparative psychology precisely the same as those 
of human psychology; was Lloyd Morgan, together with his canon (or principle 
of parsimony), necessarily on the side of the angels; and—perhaps most basic 
of all—why did the behaviourists make the fatal mistake of talking to the 
positivists? 

Mackenzie, however, is not content merely to discuss these issues. He argues— 
correctly, no doubt—that, considered as a broad movement, behaviourism 
cannot be regarded as a paradigm in the Kuhnian sense (whatever that sense 
may be); for what bound members of the movement together were methodo- 
logical considerations and not theoretical commitments. In Lakatosian terms— 
which, strangely, Mackenzie neither uses nor mentions—behaviourism appears to 
have been a programme with a heuristic but without a hard core. 

Unfortunately, Mackenzie does not stop here either. He proceeds to delineate 
a two-stage theory of the progress of science (the “context of construction” and 
the “context of reconstruction”), and takes pains to argue that while the 
behaviourists’ positivistic viewpoint was useful during one of these stages, a 
rival realist view would have been more viable at other times: 


In the context of construction realism is most appropriate, since what is 
required is the systematic development and elaboration of theories. In the 
context of reconstruction, positivism is most appropriate, since what is 
required is the critical examination, analysis, and dismemberment of 
theories (p. 141). 


This, of course, is essentially a psychological viewpoint about the degree of 
tenacity and so forth that adherents of various “philosophical” positions happen 
to possess, and in his somewhat lengthy discussion Mackenzie does not offer 
the kind of evidence that is appropriate to establish such psychological 
generalisations. Indeed, the indications are that he seems to regard it as a 
“logical” point about these positions that they inspire these attitudes—somehow 
it is necessary that realists are not as ready as positivists to criticise, analyse, and 
dismember. Surely there are readers of the British Journal for the Philosophy 
of Science who will rebel at such a suggestion! 

Mackenzie’s book nevertheless provides a great deal of food for thought, 
although it must also be pointed out that it is dished up in somewhat unpalatable 
form. There is a great deal of repetition; and heavy use is made of secondary 
sources (in one three-page section, for example, Mackenzie gives five long 
quotations from the same secondary source, and in the subsequent half dozen 
pages there are another two, half-page quotations from the same book). And 
the book is not for neophytes—although the work of Watson, Hull, Skinner 
and others is critically assessed, the reader is never told what this work consisted 
of. One has to have done one’s homework, or have a copy of some primer 
readily at hand (such as Howard Rachlin’s Introduction to Modern Behaviorism). 


D. C. PHILLIPS 
Stanford University 


GEBER, B. A. (ed.) [1977]: Piaget and Knowing: Studies in Genetic Epistemology. 
London: Routledge and Kegan Paul. £5.75. Pp. x+258. 


Despite its sub-title, this book is not primarily an epistemological study. It is 
aimed at the advanced student and research worker in psychology. Both groups 
are supposed to be in a state of ‘fragmentation’ with respect to an over-all picture 
of psychological reality: they are thoroughly conversant with a number of 
separate domains of psychological research, but are unable to ‘synthesise’ these 
into a theoretical conception of the whole. The aim of the book is: 


...to focus on synthesis through examination of the relevance of the 
theories of Jean Piaget to particular areas in psychology. (p. ix) 


Unfortunately, but not surprisingly, such a synthesis is not pulled off. Indeed 
it is not even seriously considered in any of the papers except the first by Roger 
Holmes, ‘Empiricism and Psychoanalysis: a Piagetian Resolution’, where an 
unfortunate style and some confusion ensure its failure. The implicit, and 
philosophically interesting, methodological claim in the book’s approach is 
that a unified psychological theory can be achieved through developmental 
psychology. But this claim is nowhere discussed nor put into practice. Instead, 
a philosophically out-moded ‘building-block’ view of the theoretical synthesis 
emerges. By compiling sufficient data, and in the case of some parts of the book, 
differing interpretations of the same data, it is hoped that a unified, theoretical 
picture will result. But, of course, the construction of a unified theoretical 
edifice cannot be expected on the basis of a mere survey of problems within 
disparate sub-fields of psychological research, even if there is some attempt within 
these areas to form a critical assessment of competing theoretical interpretations. 

In the case of Holmes’s paper, the attempt is not to construct a synthesis of 
various empirical theories, but to provide an epistemological framework in 
which a deep-seated ontological dispute between empiricism and psychoanalysis 
about the nature of psychological reality can be resolved. This is characterised 
as a conflict between the concern for the ‘public present’ versus the ‘private 
past’. A ‘peaceful co-existence’ is maintained, says Holmes, through the accept- 
ance by the adherents of both viewpoints of a naive belief in the objectivity of 
the world independent of the subject. This ‘out there’, as Holmes calls it, is 
then employed by each group to its own advantage. The psychoanalysts’ concern 
is with the pathological, i.e. those not accepting the ‘out there’; and the 
empiricists employ the notion of health to discard to the psychoanalyst’s couch any 
abnormal observer who refuses to agree about the nature of certain ‘immediately 
observable facts’. (According to Holmes (p. 17) anyone who looked down 
Galileo’s telescope would see Jupiter’s moons; and those that did not would 
be material for the Psychoanalyst.) 

Holmes claims that this co-existence is illegitimate because each discipline 
cannot account for the division upon which it depends. He proposes that we 
abandon the idea 


...that there is one ‘present’ that all right-minded men subscribe to. 
That people should agree at all must be seen as the problem to be resolved, 
not the baseline on which to build. (pp. 20-1) 


This problem, which amounts, I think, to Kant’s Problem put in a develop- 
mental perspective, is the one which Holmes sees Piaget solving. Unfortunately 
the presentation of the problem has left a great deal of confusion, at least in 
this reader’s mind. Holmes sees his problem as that of accounting for the 
emergence of an objective world from an interaction of a ‘formless subjectivity’ 
with an unknown world. For Piaget, however, the subject is never formless; 
he begins the developmental process with inherited neurological and physio- 
logical structures giving rise to behavioural schemes of action. And, while the 
world may be unknown for Piaget, it is also structured and objective. 

From this basis, nothing but obscurities follow. One of the most seductive of 
these, from the standpoint of evolutionary epistemology, is the erroneous 
argument for the species-relativity of truth: 


The way we know is part and parcel of the equipment we possess. As such 
it should be judged ... not as being ‘right’ or ‘wrong’ but as being helpful 
or unhelpful... This is not to say that knowledge cannot be ‘right’ or 
‘wrong’ but rather the fact that we as a species think that something is 
‘right’ or ‘wrong’ need have no conclusive force: it merely suits us from 
the point of view of our survival to think the way we do. Different species 
survive in different ways and so no doubt ‘know’ different things. This 
goes for ourselves as well: the sights, sounds, colours and poetry we all 
recognize are those that enable us to live. It is insufferable to assume that 
they should be more (p. 32). 


Holmes’s excessive use of single inverted commas to employ familiar words 
in unfamiliar ways begs all the important questions, here, as elsewhere in the 
paper. Different species may ‘know’ different things but, to employ Spinozistic 
language, we know that we know. Moreover, we change what we ‘know’, our 
theories, by criticism and refutation. Thus our evolution extends into the 
growth of knowledge which is regulated by the notions of truth and validity. 
These notions, applied to a particular theory, are not conclusive, but neither do 
they merely suit us for survival. In short, Holmes has made the error of making 
truth subservient to survival—an error into which an evolutionary epistemologist 
may unwittingly fall—and so left the problem he set out to explain (namely 
the emergence of an objective world) entirely unaccounted for. 

This is not surprising for one who writes: ‘begging the question—taking 
definition for granted—is the price we pay for intelligibility’ (p. 21). For the 
reviewer, however, such question begging had the opposite effect. Another 
major problem in the paper is the ‘blitzkrieg rate’ at which Piaget’s resolution 
of the problem is articulated. This turns Piaget’s often confusing exposition 
into chaos. One example is Holmes’s introduction of Piaget’s notion of equili- 
brium: 

An equilibrium is a state of affairs that reverses time. With an equilibrium 
it is easier to predict the future than describe the present. 


Although most of the other papers are restricted to the discovery of empirical 
correlations between various factors in cognitive development and hence are 
primarily of interest to psychologists, there are some in which theoretical 
interpretations of the facts are presented and criticised. These are of philo- 
sophical interest for two reasons. They provide a case study of the growth of 
scientific theories through criticism and refutation. And, secondly, they are 
indicative of the connections between empirical questions about the way people 
actually acquire knowledge and reason on the one hand, with the normative 
discipline of epistemology and logic on the other. 

Two such papers, more closely associated with the first point of interest, are 
those by Peter Bryant and Hans Furth, respectively entitled ‘Logical Inferences 
and Development’ and ‘The Operative And Figurative Aspects of Knowledge 
in Piaget’s Theory’. Bryant attacks Piaget’s interpretation of the experimental 
results concerned with formation of transitive relations in ‘concrete operational 
children’, children possessing rudimentary logico-mathematical structures 
which are usually acquired between the ages of five and seven. Piaget claims 
that if a child is shown two sticks, A and B, of unequal length such that A > B 
and slightly later, shown B and C such that B > C, then the failure of the child 
to conclude that A > C is evidence of the failure of the child to possess the 
concrete operational structure requisite for the understanding of transitive 
relations. Bryant points out the existence of several alternative possibilities. 
The child might simply have failed to understand what was required of him; 
he might have failed to register the information necessary to draw the desired 
conclusion; or, he might have received this information but then forgotten it. 

To test these possibilities Bryant modifies Piaget’s experimental design. This 
involved the use of five sticks, labelled A, B, C, D, E, presented to the subjects 
in pairs. This was done to avoid the possibility that Piaget’s positive results 
were due to ‘verbal parrotting’, the conclusion that A is longer than C solely on 
the grounds that previously the subject was told that A was longer than B. 
With the use of five sticks presented in pairs, the possibility of parrotting is 
eliminated for the three sticks B, C, D, since each would have been both the 
longer and the shorter stick. Thus the crucial pair was BD. Bryant found, 
contrary to the expectation of Piaget’s theory, that four-year-old children tended 
to conclude that B > D. Subsequent modifications, designed to eliminate the 
possibility of ‘end-point associations’, also contradicted the Piagetian expectation 
that four-year-olds would not possess any concrete operational structures. 
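
The logic of the five-stick design can be checked mechanically. The following is only an illustrative sketch in Python, not anything given by Bryant or by the book under review: the variable names are hypothetical, and it assumes that the sticks are ordered from longest (A) to shortest (E) and that training uses only the four adjacent pairs. It confirms that B, C and D are the only sticks to have been named both ‘longer’ and ‘shorter’, and that BD is the only untrained pair made up of such sticks.

```python
from itertools import combinations

# Assumption: sticks are ordered from longest to shortest.
sticks = ["A", "B", "C", "D", "E"]

# Assumption: training presents only adjacent pairs (AB, BC, CD, DE),
# the first member of each pair being the longer one.
training_pairs = list(zip(sticks, sticks[1:]))

# Record every label each stick received during training.
roles = {s: set() for s in sticks}
for longer, shorter in training_pairs:
    roles[longer].add("longer")
    roles[shorter].add("shorter")

# A stick cannot be answered by parrotting if it carried both labels.
ambiguous = [s for s in sticks if roles[s] == {"longer", "shorter"}]

# Crucial test pairs: never trained together, both members ambiguous.
crucial = [
    (x, y)
    for x, y in combinations(sticks, 2)
    if (x, y) not in training_pairs and x in ambiguous and y in ambiguous
]

print(ambiguous)  # ['B', 'C', 'D']
print(crucial)    # [('B', 'D')]
```

With only three sticks the end members A and C each carry a single label, which is why a correct answer about AC could be produced by repeating a label rather than by inference.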

In his paper, Furth challenges Bryant’s conclusions. He gives an excellent 
summary of Piaget’s distinction between figurative and operative knowing. 
The former is basically perceptual and concerned with properties of a perceived 
object, while the latter is basically conceptual and concerned with the nature 
of possible transformations from one state of an object to another, given a 
particular operational structure. Furth claims that Bryant’s results are a reflection 
of the figurative plane which is epistemologically dependent upon the operative. 
That is, Furth makes the Piagetian claim that operative knowledge, knowledge 
about the relations between possible transformations of the states of some 
object, is epistemologically superior to perceptual or figurative knowledge in 
that the perception of some states of an object may be dependent upon the 
prior possession of certain operational structures. This, of course, is highly 
reminiscent of Kant’s epistemological position in which the perception of the 
world is dependent upon the possession of the requisite categories in order for 
our perceptions to be connected and synthesised. 

Furth reports three experiments to support his claim. One of these is directly 
related to Bryant’s experiment above. The rationale underlying this experiment 
is that if subjects claim that given relations are transitive in some experimental 
design both when this claim is appropriate and when it is not appropriate, then 
that is evidence that they do not understand the concept of a transitive relation. 
Furth goes further than this, however, and, on the basis of Piaget’s theoretical 
distinction between figurative and operative knowing, concludes that such a 
situation is evidence for the fact that the subjects concerned must have based 
their ‘transitive’ conclusions upon figurative or perceptual cues which, in turn, 
were not supported by any operational structure. 

The experiment concerned involved the use of five stick men, each with a 
distinguishing characteristic, e.g. Mr Nose, Mr Ears etc. All five men were of 
similar height, but five-year-old subjects were trained to identify the first of 
any pair of stick-men AB, BC, CD, DE, as the taller. Later, when the five men 
were presented in random pairs, it was found that, in the crucial case (‘is B 
taller than D?’) 80 per cent of the four-year-old subjects made the transitive 
reply. However, in a similar experiment in which the only difference was that 
the relation holding between the pairs of stick-men was an arbitrary one—in 
one pair Mr Nose gets the red button and Mr Ears gets the blue one, in the 
next Mr Ears gets red while Mr Tie gets blue—it was also found that 80 per 
cent of the subjects when asked about Mr Nose and Mr Tie made what was 
now an inappropriate ‘transitive’ reply (Nose gets the red button and Tie the 
blue). 

In contrast, the same two tests were presented to eleven-year-old children. 
This time there were similar results on test 1, the case of the ‘taller than’ relation. 
But in case 2, the ‘button-colour’ relation, 83 per cent of the subjects made the 
appropriate reply that they did not know which colour buttons are assigned to 
Mr Nose and Mr Tie. 

The second point of interest can perhaps be best highlighted in Peter C. 
Wason’s ‘Theory of Formal Operations: A Critique’. Wason questions Piaget’s 
contention that at the level of formal operations (the highest level of child 
development usually achieved between eleven and fourteen years of age) 
conceptual thought is separate from its ‘content’, the actual situation to which 
logico-mathematical structures of that stage may be applied. This claim is 
substantiated empirically by an ingenious experiment. Subjects are shown 
four cards each containing a letter on one side and a number on the other. If 
the symbols A, D, 4, 7 are shown, they are asked to turn over the least number 
of cards to determine the truth-value of the following sentence: ‘If a vowel is 
showing on one side, then an even number is on the other’. Wason found that 
‘reasoning was radically affected by content in a systematic way’ (p. 132). 
The effects include a tendency to concentrate only upon symbols directly referred 
to in the test sentence, i.e. vowels and even numbers such that the importance 
of the seven was overlooked, and to find solutions to isomorphically structured 
problems more easily when that structure is presented in more concrete and 
realistic terms. 
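
The normative answer to the card problem can be set out explicitly. The sketch below is an illustration in Python only; the function names are hypothetical and nothing in it is taken from Wason’s paper. It rests on the single observation that the test sentence is falsified only by a card pairing a vowel with an odd number, so a card needs to be turned exactly when its visible face could belong to such a pairing.

```python
VOWELS = set("AEIOU")


def is_vowel(face: str) -> bool:
    """True if the visible face is a vowel."""
    return face in VOWELS


def is_odd_number(face: str) -> bool:
    """True if the visible face is an odd number."""
    return face.isdigit() and int(face) % 2 == 1


def must_turn(face: str) -> bool:
    # The rule 'if vowel then even' is violated only by vowel + odd.
    # A visible vowel may hide an odd number; a visible odd number
    # may hide a vowel. Consonants and even numbers cannot falsify it.
    return is_vowel(face) or is_odd_number(face)


cards = ["A", "D", "4", "7"]
to_turn = [card for card in cards if must_turn(card)]
print(to_turn)  # ['A', '7']
```

The 4, although mentioned in the sentence as an even number, is irrelevant: whatever is on its other side, the rule is not broken. The 7 is the card subjects tended to overlook.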

In sum, the failure of the book’s only epistemologically orientated paper and 
the paucity of theoretical discussions of empirical issues raised by Piaget’s 
theory lead me not to recommend it, except as a source book for summaries of 
some experimental results. 

ROB HORWOOD 
London School of Economics 


SPINNER, H. [1974]: Pluralismus als Erkenntnismodell. Frankfurt am Main: 
DM 9.50. Pp. 300. 


Helmut Spinner is for ‘pluralism’ and ‘fallibilism’, and he is against ‘monism’, 
‘certism’, ‘fundamentalism’, ‘justificationism’, ‘exhaustionism’, and ‘the 
indifferentism of total scepticism’. (This list could easily be extended: it has been 
extracted from his book by going through just a few pages.) His book, which 
consists of three separate articles, presents a plea for ‘theoretical pluralism’ 
which Spinner takes to be far superior to its Popperian forefather called ‘falli- 
bilistic criticism’. As this plea seems to be Spinner’s main concern I shall 
concentrate on it in what follows. I shall argue that those of his claims for 
theoretical pluralism which are reasonable have been commonplace for decades, 
especially in the Popperian tradition, while those which are not commonplace 
are not reasonable. 

The reasonable part comes to this: given some field of research, it will usually 
be advantageous for the progress of our knowledge if there is more than just 
one line of reasoning to be pursued. New theories may give rise to the discovery 
of new facts which are relevant to our field of research, and which are not 
covered by an already existing well-corroborated theory or are even incom- 
patible with it. Furthermore, new theories may also lead to a reinterpretation 
of what had been taken for facts in the light of some previous theory. All this 
of course is not just a matter of a plurality of theories but, much more importantly, 
of a plurality of good theories. Indeed, one great and fruitful theory is far more 
desirable than a thousand third-raters. Already on this last point Spinner is 
somewhat ambiguous; occasionally he gives the impression that it was just a 
matter of quantity rather than of quality: 


More (and as a consequence of this also better) theories are achieved along 
two initially very different lines... : by increase of the rate of births on 
one hand and by decrease of the rate of deaths...on the other hand 
(p. 90; the translation of this as well as of all the following quotations is 
mine). 


I don’t think that it is as easy as that. Good theories are not brought about by 
pluralism, they are a precondition for pluralism to be of any value at all. But if 
Spinner agrees, and I think he does, with this, then he has told us nothing new. 
It has always been a standard feature of Popperian philosophy of science that 
we may look out for promising alternatives (hard as they may be to come by) 
even if there happens to be a well-corroborated theory in existence. This is 
nothing but an aspect of the general undogmatic attitude which has been 
preached by Popper (and also by many of his positivist adversaries of the 
Vienna Circle) ever since the 1930s. Spinner’s enthusiastic references to Paul 
Feyerabend as the great innovator who supposedly developed theoretical 
pluralism from Popper’s merely ‘monotheoretical’ view are plainly mistaken. 
This relates not only to ‘proliferation’ (increase of birth-rate) but also to 
‘tenacity’ (decrease of death-rate), though there is more to be said about the latter. To 
speak in terms of ‘births’ and ‘deaths’, ‘kills’ of theories, or their ‘death-struggle’ 
is all very reminiscent of Imre Lakatos, and so one finds another one of Lakatos’s 
contentions—and a very mistaken one at that—occupying a prominent place 
in Spinner’s claims. It is the claim that Popper conflated refutation with re- 
jection, and therefore would not give a new, not yet fully developed theory 
enough time to grow stronger. Thus 


... any defeat amounts to a practically final ‘falsification’ (as even Popper 
claims or rather postulates) which leads to the final, possibly premature 
elimination of a theory (p. 91). 


Nothing of that kind was ever held by Popper. Of course, given some undisputed 
refuting instance (accepted falsifiers), any theory in conflict with it is falsified. 
This is purely a matter of logic. On the other hand, the decision to eliminate 
(abandon) a theory is a matter of methodology which depends on various factors, 
falsification being only one among them. But all that needs to be stressed at this 
point is, that there is certainly not an ‘. . . identifying linking of refutation and 
rejection of scientific theories in Popper’s concept of falsification’ (p. 214). 
Quite the contrary. Popper has repeatedly stressed that the rejection of a theory 
is a risky affair. He actually calls for a certain kind of dogmatism (though 
‘cautiousness’ might be more appropriate) without which ‘... we could never 
find out what is in a theory’ ([1963], p. 312n.). ‘If we accept defeat too easily, 
we may prevent ourselves from finding that we were very nearly right’ (op. cit., 
p. 49). 

What I have objected to so far is only Spinner’s claims for originality in 
Feyerabend’s and his own ‘theoretical pluralism’. But I do not dispute the 
originality of certain further ‘insights’ which Spinner offers. There is one 
passage where he presents them all together: 


The transition from the [Popperian] monotheoretical model of falsifications 
to a genuine pluralistic test-model is marked by the insight, that the 
postulated maximum of criticism can only be achieved with the help of 
alternative theories—whereby Feyerabend’s most original point lies in 
his thesis that this is generally valid even for empirical criticism, because, 
firstly, there exist scientific theories which can in principle only be refuted 
by (alternative) theories; and, secondly, because there also exist facts 
which are relevant critical instances—under certain circumstances even 
the only possible falsifiers—in respect to certain theories, and which 
[facts] can as well in principle only be discovered with the help of (alterna- 
tive) theories. From this follows Feyerabend’s conclusion, that ‘the best 
criticism is provided by theories which can replace the rivals they have 
removed’...—therefore [this criticism] is precisely not [provided] by 
Popper’s ‘falsifying hypotheses’ which can be ‘of a lower’, even ‘a very low 
level of generality’, need not be strictly general like real theories ..., and 
are obviously no alternative theories. (p. 257) 


With ‘maximum criticism’ Spinner refers here to the Popperian view that we 
should test our theories as severely as possible in order to find out their weak- 
nesses. I have already pointed out, that alternative theories may help us in 
suggesting such severe tests. But whether such tests are suggested by alter- 
natives or not, we will always need low-level hypotheses in order to get some 
verdict on our theory under test. This is what Spinner denies. He repeatedly 
stresses as his main point, that ‘the specific difference from [Popperian] mono- 
theoretical falsificationism lies in the fact, that the test-situation in pluralistic 
test-models is a strictly intertheoretical relation’ (p. 164). ‘In the pluralistic 
test-model, theories are only refuted by alternative theories of equal rank’ 
(p. 165, and similarly p. 93). Obviously Spinner thinks that one theory can be 
refuted by another theory directly, without invoking low-level hypotheses 
(basic statements). How this should be done, however, is nowhere made clear. 
I suspect that it has something to do with a mysterious ‘theory of intertheoretical 
relations’ which Spinner thinks to be so desirable (‘... what is needed are 
new theories of intertheoretical relations with multidimensional criteria of 
comparability and complex relational structures’ (p. 171)), but which he 
nevertheless fails to develop or even to sketch. 

But whatever these relations may turn out to be—in order to test a theory, we 
need something accepted (even if only provisionally) against which we can test. 
Mere comparison cannot do the job of testing. If this is true—and it seems to 
be so trivially—then Popper is a true pluralist, while Spinner must either 
adopt a dogmatic, utterly unpluralistic test-procedure, or give up testing at all: 
Popper’s approach is, to accept (provisionally) some basic statements (low- 
level hypotheses) against which not only one, but of course a plurality of theories 
can be tested. Spinner does away with basic statements. But then he must 
either adopt some conventionalist strategy and accept one theory which falsifies 
all incompatible alternatives a priori, or—if he refrains from accepting any of 
them—he can only indulge in the sterile exercise of merely comparing theories 
without ever arriving at a verdict. I cannot see a point in either of these options. 
Furthermore, I think it is plainly false, that some theories could ‘in principle 
only be refuted by (alternative) theories’ just as I think that there are no such 
refuting instances ‘which can in principle only be discovered with the help of 
(alternative) theories’ (compare the quotation above). Of course, alternative 
theories may help us in both cases (and they often actually do so); but the view 
that we must have alternatives before we can falsify a theory or uncover some 
falsifying states of affairs is—to put it mildly—somewhat exaggerated. 

One last comment on a difficulty involved in trying to assess Spinner’s book: 
there are many points which might be of interest, if Spinner would only manage 
to spell out more clearly what he has in mind with them. Take for instance his 
characterisation of a ‘pluralistic problem situation’ (a central term to his approach), 
which—according to him—is ‘...a plurality of somehow connected 
theories, positively or negatively related to each other—that is, complexes, 
systems, or series of theories which are linked together among each other by 
statical or dynamical relations to a metatheoretical problem field’ (p. 167 f.). 
As long as Spinner does not tell us how these ‘somehow’ related theories are 
‘linked together’, this remains scarcely more than verbiage. 


ALFRED SCHRAMM 
Karl-Franzens-Universität, Graz 


£10.25. Pp. 225. 


This is an important contribution to a rapidly growing field in the history of 
science. The increasing interest in nineteenth and twentieth century biology 
reflects a general turn toward biology in the public concern with science. We 
want to know more about the origins of modern biology. The nineteenth 
century was the period of transition from natural history to experimental 
biology as the dominant mode of biological research. And during that same 
period the conscious application of biological science to practical problems led 
to a revolution in agriculture and medicine. 

The debate over spontaneous generation is excellently suited to throw light 
on the central theoretical developments in biology during this period, and to 
locate them in a social context. The controversy was, of course, intimately 
linked to the basic theories of life. It reflected the conflicting ideas of what 
life is, or the ‘definitions’ of life, and was therefore full of ideological overtones. 
It was also fired by practical concerns, for instance through the germ theory 
of disease which was only established after a protracted battle against pro- 
ponents of spontaneous generation. 

The classical accounts of this controversy have looked upon it as a paradigm 
case for the superiority of the experimental method in biology over the observa- 
tional and speculative methods of the older natural history approach. It is 
therefore very pertinent of John Farley now to conduct a new analysis of this 
controversy. No doubt, there are many additions to be made to the traditional 
story, additions that can serve to undermine the hybris of experimental biology. 
But I think that Farley is overdoing the job. His concern with the influence of 
external social factors in scientific developments and his sympathy with the 
underdogs sometimes obscure the scientific issues. 

Farley covers the whole period from Descartes to Oparin, but the main 
portion of the book is concerned with the nineteenth century. This is where 
the controversy over spontaneous generation properly belongs. Stretching the 
concept of spontaneous generation to cover earlier and later periods is problem- 
atic. The emphasis that Farley gives to investigations on parasitic worms 
seems well justified. Before their alternation between different hosts was dis- 
covered, the occurrence of these parasites must have appeared as strong evidence 
for spontaneous generation. More precisely they were conceived as evidence 
for heterogenesis, the idea that living organisms belonging to a certain species 
can arise spontaneously out of organic material which either is part of or has 
been part of some other kind of organism. But in 1842 the Dane Japetus Steen- 
strup published his treatise On the Alternation of Generations which proposed 
an explanation for the appearance of these parasites which was in accordance 
with the principle that all living organisms have parents of the same kind. 
Within a decade anatomical studies and feeding experiments had destroyed 
parasitic worms as an important support for spontaneous generation. 

In the second half of the nineteenth century the central theoretical problem 
of the spontaneous generation debate was abiogenesis rather than heterogenesis. 
This means that new organisms originate from inorganic material and not from 
organic. The idea of abiogenesis drew support from the Darwinian theory of 
evolution which seemed to demand abiogenesis, at least at some distant point 
in the past, to be consistent. This shift in the theoretical problem is well brought 
out in Farley’s account. 

A new shift in the problem had occurred when ‘the origin of life’ became a 
much discussed topic in the early decades of the twentieth century. By then 
most scientists were not thinking of living organisms originating at the present 
and not in terms of a sudden origin of even quite simple organisms like bacteria. 
Among those who believed in an origin of living from non-living this was 
mostly thought of as a process that took place in a distant past when conditions 
on Earth were quite different from the present, and as a process in many steps. 
Through the advances made in geology, cell biology and biochemistry one was 
now able to substitute for the imprecise question of how a whole living cell 
could have originated, more precise questions about the possible intermediate 
steps between inorganic chemicals and a simple living cell. 

These shifts in the problem are clearly present in Farley’s account, but he 
could have used them more effectively in the analysis that lies behind his more 
general conclusions. For instance, a clearer recognition of these problem shifts 
would have made Pouchet appear considerably less scientifically rational 
relative to Pasteur, and Oparin less of an influential innovator, than Farley 
claims them to be. It is misleading to say that the view of the nineteen-twenties 
on the abiogenetic origin of life was ‘essentially the same’ as the earlier views on 
spontaneous generation, both accepting ‘that an entity having all the essential 
characters of life could arise suddenly from matters that lacked these attributes’ 
(p. 155). What H. J. Muller for instance was concerned with was the sudden 
origin of a certain kind of chemical molecule and not of a complete cell. In 
contrast to Pouchet he was quite aware that even a bacterium was much too 
complex to originate suddenly. 

Oparin’s The Origin of Life from 1936 may have been the best review of the 
problem till then. But Farley does not present any convincing evidence to 
show that it had such a decisive impact on the theories about the origin of life 
as he claims (e.g., pp. 155, 171, 176). On the contrary Farley himself presents 
quite good evidence that the gradualist view, whose introduction he attributes 
to the ‘dialectical materialist’ Oparin, was expressed for instance by E. A. 
Schaefer in 1912 (p. 169). What Oparin did was to sum up views that were 
already widespread among scientists. Farley’s claim that ‘By 1930, the issue of 
spontaneous generation was as confused as it had ever been’, is hardly borne out 
by his own story. One has a feeling that this wholesale denial of progress in 
biology is inspired by a sympathy for ‘dialectical materialism’ which has not 
been properly checked against the historical facts. 

This analytic weakness in Farley’s book has links to his conception of scientific 
method. One reason why he does not appreciate the objections of Pasteur and 
the French Academy of Science against Pouchet and Bastian is that he shares 
to a considerable extent their belief in simple statements of observational facts 
as the basis of science. The logical problem with which Farley burdens the 
opponents of spontaneous generation is a construction based on naive 
falsificationism: 


Opponents of spontaneous generation, by arguing that all organisms arise 
from parents or, conversely, that no organism arises spontaneously from 
matter, were faced with a logical dilemma in that both these statements 
can be falsified but neither of them can be proven with absolute certainty. 
On the other hand, their opponents, by arguing that some organisms can 
indeed arise directly from matter, held to a belief that cannot be falsified 
but can be proven. Logically speaking, therefore, opponents of spontaneous 
generation could do no more than invalidate particular experiments said 
to illustrate its occurrence. (p. 4) 


There was no such logical asymmetry in the situation. Pasteur and other op- 
ponents of spontaneous generation were building a rival theory, and the pro- 
ponents of spontaneous generation were just as much obliged to falsify their 
experimental results as vice versa. 

On this primitive conception of scientific method Farley bases his claim that 
Pasteur violated ‘the experimental method’ during his debate with Pouchet 
(p. 115), that no attempt was made to refute the findings of Pouchet’s experiments 
in the Pyrenees (p. 111), and that the jury of the Academy of Science was 
biased and had no excuse for its ‘uncritical acclaim’ of Pasteur’s experiments 
in 1864 (p. 111). In the conflict between Bastian and Pasteur one also misses 
an account of what the controversy looked like from the other side. It seems 
quite likely that Bastian’s story about his conflict with the French Academy 
of Science does not appreciate the methodological reasons for their views on 
how the inquiry should be conducted. 

Farley's book is a valuable attempt to vitalise the history of biology. He 
effectively poses some problems about the social relations of biological science, 
bridging the much-debated chasm between external and internal history in an 
unstrained way. In a very readable account he presents interesting facts and 
challenging interpretations of the developments that led to modern biology. 
It is up to those who disagree with his philosophical messages to produce 
equally readable and more convincing accounts. 


NILS ROLL-HANSEN 
The Norwegian Research Council 
for Science and the Humanities, Oslo 


BROWN, HAROLD I. [1977]: Perception, Theory and Commitment: The New 
Philosophy of Science. Chicago: Precedent Publishing Inc. $15.95. Pp. 203. 


Perception, Theory and Commitment is an interesting and provocative book 
written in a clear and straightforward way. It contains a criticism of logical 
empiricism and the presentation of an alternative view of science that has much 
in common with Thomas Kuhn’s position. The case against logical empiricism 
is so simple and incisive that defenders of that position will be forced to claim 
that it has been oversimplified or misrepresented. 

Part of the case against logical empiricism and an important component of 
Professor Brown’s own position involves the no longer novel thesis that all 
observation is theory-laden and so does not provide a neutral court of appeal 
for assessing the relative merit of theories. The discussion of that thesis some- 
what overemphasises the psychological aspects of the issue, concerned with 
the fact that the sensations that we experience when observing are influenced 
by our theoretical expectations and so forth, at the expense of more objectivist 
aspects of theory-ladenness. Good experimental situations will be such as to 
minimise the significance of the perceptual experiences of experimenters and 
their judgments about them, so that, as far as possible, the observer’s role is 
confined to such unproblematic activities as the reading of meters or computer 
printouts. The crucial factors then become those concerned with the objective 
nature of the experimental set-up and the procedures carried out. It is the 
theories presupposed by these, and also those involved in the interpretation and 
formulation of the results, that constitute the more significant and deep-seated 
aspects of theory-ladenness. 

The author offers an account of perception according to which, he claims, 
‘the putative gap between theory and observation disappears and the problem of 
psychologism is dissolved’ (p. 91). The claim is not well-founded. It is suggested 
that, as observers, we perceive meanings rather than sense-data or anything of 
that kind. In this way the gap between sensations and observation statements 
is closed. But even if we accept this thesis, which is somewhat inadequately 
defended in the book, it cannot be claimed that the problem of psychologism 
is thereby resolved. From the author’s own standpoint it must be admitted that 
different observers will often perceive different meanings in the same situation, 
so the problem of the justification of meanings and the choice between rival 
meanings remains. Professor Brown certainly does nothing to dissolve that 
problem. 

Two interesting aspects of the author’s own position are intimately connected 
with the theory-ladenness thesis. One is the insistence that an analysis of theory 
change, and of the relationship between a theory and its successor is essentially 
an historical rather than a logical problem. The other is the rejection of the 
search for a timeless, universal criterion of adequacy of theories. The author 
insists that criteria of adequacy are internal to a science, change in time, and 
must be evaluated with respect to the theoretical and historical situation. Both 
points are supported by historical examples which, although the details of some 
of them are controversial, suffice to make the author’s case a strong one. 

The account of science defended in the book is essentially a relativist one as 
the author admits. The direction of science is determined by consensus reached 
in a dialectical process of argument and counter-argument. The author is aware 
that a basic problem here is to distinguish legitimate from illegitimate modes 
of reaching a consensus. The account he offers is theoretically weak, relying 
on a few extreme examples of illegitimate interventions in the dialectical process 
such as the persecution of Galileo by the Church and the support of Lysenko 
by the Soviet Union. No indication is given of how more difficult questions are 
to be tackled, such as the status of the consensus reached over the acceptance 
of Aristotelian Cosmology in medieval Europe or the contemporary rejection 
of historical materialism. 

The notion of dialectic adopted by the author is general enough to be applicable 
to Plato’s treatment of justice, to philosophy in general, and to science. Indeed, 
one of the theses supported in the book is that philosophy and science are 
similar activities. This is a weakness. Just as Paul Feyerabend argued against 
Kuhn’s position by demonstrating that the latter’s account of science applies 
equally well to organised crime, so the author argues against his own position 
when he indicates that his own account of science applies equally well to 
philosophy. 

Professor Brown lists three possible responses to his critique of logical 
positivism. We can give up the attempt to construe science as a rational activity, 
we can attempt to develop logical positivism in spite of its problems, attempting 
to overcome them, or we can adopt a relativist theory similar to the author’s 
own. There is certainly one other possibility. We can attempt to give an ob- 
jectivist, non-relativist account of science which construes theory change, not 
in terms of the decisions made by individual scientists or groups of scientists, 
but in terms of the objective properties of theories, the extent to which they 
open up new avenues for development and the extent to which those new 
avenues bear fruit. Such an account would need to be augmented by an account 
of science as a social activity, incorporated into an objectivist theory of social 
structures and social change. 

This book constitutes a concise account of a relativist, consensus theory of 
science of a kind that is currently in vogue but is not usually so clearly formulated 
and defended. As such it is well worth reading. 


A. F. CHALMERS 


University of Sydney 


MOHR, H. [1977]: Lectures on Structure and Significance of Science. New York: 
Springer-Verlag. DM 38,20. Pp. xi+227. 


Following an introductory lecture, lectures 2-6 of the book are concerned with 
the structure of science and the scientific method. These are followed by three 
lectures relating to the biological sciences, the most interesting section of the 
book. The perspective is broadened in lectures 10-14 to consider the place of 
science in society. The final lecture attempts to draw epistemological conse- 
quences from evolutionary theory. 

The analysis of the structure of science is a traditional, and, in the opinion of 
the reviewer, a somewhat outmoded one. The basis of science is made up of 
true singular propositions based on observation (p. 42). There is a distinction 
between observational and theoretical constructs. Constructs need empirical 
confirmation to become valid constructs. The theoretical and observational 
level are connected by bridge principles (p. 38). If it is to be acceptable an 
hypothesis must be free of contradictions and must offer ‘a satisfactory ‘ex- 
planation’ for those singular propositions whose information content (objective 
knowledge) has been the basis and the motivation for the construction of the 
hypothesis’ (p. 45, italics in original). ‘A good theory is a theory that predicts a 
large number of singular propositions that are accessible to an empirical check 
by observation and experiment’ (p. 47). This positivist account of science has, 
of course, been subject to a sustained attack in recent decades, initiated by the 
likes of Gaston Bachelard and Karl Popper and sustained by Paul Feyerabend, 
Thomas Kuhn and Imre Lakatos. In particular, the distinction between observa- 
tional and theoretical statements has been seriously challenged if not demolished. 
The mere fact that the author advocates a positivist analysis of science is not 
itself a criticism. Had he presented the doctrine in a novel way and defended it 
against some of the criticism to which it has been subjected then the account 
might well have been of great interest. However, no novel defense of positivism 
is offered. Bridge principles and the core of the Hempel-Oppenheim model 
of explanation are accepted as if they were unproblematic. 

Given the positivist bias of the author’s analysis of the structure of science, it 
is surprising to find him, in a lecture on Tradition and Progress in Science, 
defending Kuhn’s conception of paradigms, consensus and the scientific 
community. Professor Mohr embraces Kuhn’s somewhat conservative views 
on the latter, turning a blind eye to those aspects of Kuhn’s writings that pose 
a serious challenge to the view of science he expounds in the earlier lectures. 
That view does make its presence felt during the discussion of Kuhn, however, 
when we are reminded that ‘the decisive point in the evaluation of a paper is 
the validity of the empirical facts and the internal consistency of the argument’ 
(p. 131, italics in original). One of the distinguishing features of Kuhn’s position 
is his denial that such considerations are decisive. 

Since Professor Mohr writes ‘as a natural scientist with some interest in the 
nature of scientific thought’ (p. ix) one might expect his comments to be of 
most value when concerned with his own field of expertise, namely, the 
biological sciences. This is indeed the case. For instance, the comparison of 
biochemistry and physiology and the discussion of the need for the latter to be 
understood as an analysis of systems not to be identified with the sum of their 
elements is interesting and suggestive. However, like most philosophers of 
science, Professor Mohr persists in taking the science of physics as his ideal, 
and so, for example, compares biology unfavourably with physics on the grounds 
that the former has not been axiomatized to anything like the same extent. It 
would possibly have been more interesting had he concentrated on biology 
itself, perhaps stressing the differences between it and physics, thereby chal- 
lenging the extent to which most philosophy of science is modelled on physics. 
Like much writing in the field, the sections on the social aspects of science 
lack an adequate theoretical perspective. The analysis is largely in terms of the 
attitudes of individuals. Marxists and advocates of the counter culture encourage 
the incorrect attitude. We are given a tentative list of commandments designed 
to encourage the correct attitude in scientists. Admittedly, the typical attitudes 
of scientists in a society and the institutionalised means of fostering those 
attitudes will figure in any analysis of the social aspects of science. However, 
that study cannot be carried out in isolation from an objective characterisation 
of the social structure in which science is practiced. The author is surely correct 
when he criticises defenders of the counter culture for advocating idyllic utopias 
without offering any detailed analysis of how those utopias may be attained or 
worked towards. However, he is guilty of a similar failing when, for example, 
he urges that ‘the experienced scientists must learn to abstain from alliances 
between individual scientists and politicians to push pet programs’ (p. 175). If 
the social set-up is such that alliances of the kind alluded to can lead to economic 
or other advancement for those indulging in them, then the problem must be 
solved by changing the social structure so as to remove those opportunities. To 
hope to solve the problem by focusing exclusively on the attitudes of scientists 
and by urging them to have more high-minded attitudes is itself utopian. 

The author attempts to explain the success of science and, in particular, the 
success of mathematically formulated theories, by an appeal to evolutionary 
theory. The ability to think in terms of logic and mathematics is conducive to 
survival. Humans have that ability built into their genetic code and have sur- 
vived by virtue of that fact. The theory suffers from the old mistake of under- 
standing knowledge as something fundamentally residing in the minds of 
individuals, rather than as an objective structure residing in books, in material 
apparatus and techniques for its use and so on. The author’s assumption is 
very clear when he sets up the dichotomy between the tabula rasa theory of the 
empiricist, according to which all knowledge originates from observations 
made by individuals, and the theory that some knowledge is present at birth in 
the form of inborn structures or innate ideas. We are offered an evolutionary 
turn on the latter. The weakness of the theory becomes apparent when we 
entertain specific questions such as ‘why did abstract theorising about the 
world originate in Ancient Greece?’ or ‘Why did mathematical physics flourish 
from Galileo onwards?’. It would seem that an answer to such questions based 
on evolution in a Darwinian sense is quite inappropriate and an objectivist 
analysis essential. This evolutionary epistemology possesses some disturbing 
overtones, epitomised in remarks like ‘think predominantly right—and succeed, 
or, think predominantly wrong—and perish’ (p. 204). That remark all too 
readily translates into ‘to be powerful is to be right’ which just will not do. 
The misapplication of Darwin’s theory in the domain of epistemology is remi- 
niscent of the attempt of Henry Adams to apply the thermodynamic phase rule 
to historical change and of attempts to solve the problem of free will using 
Heisenberg’s uncertainty principle. 

This book cannot really be considered to make an important contribution 
to the philosophy of science. 


A. F. CHALMERS 
University of Sydney 
WESTFALL, RICHARD S. [1977]: The Construction of Modern Science. Cambridge 
University Press. Hardback £7.95. Paperback £2.50. Pp. xvi+171. 


This is a re-issue of a book first published in the United States in 1971, the 
outcome, as the author states, of lectures given over the preceding seven years. 
In spite of this it is reasonably up to date except for its bibliography. 

The Construction of Modern Science has as its main theme that immortalised 
by the late Professor Dijksterhuis in his classic Mechanisation of the World 
Picture (1961) as its chapter titles reveal: ‘Celestial Dynamics and Terrestrial 
Mechanics’, ‘The Mechanical Philosophy’, ‘Mechanical Science’, ‘Mechanical 
Chemistry’, ‘Biology and the Mechanical Philosophy’, ‘Organization of the 
Scientific Enterprise’, ‘The Science of Mechanics’ and ‘Newtonian Dynamics’. 
A subsidiary theme is that which Professor Westfall calls ‘The Platonic-Pythagorean 
tradition, which looked on nature in geometric terms, convinced that the 
cosmos was constructed according to the principles of mathematical order’. 
(As he remarks in the Preface, he has given less attention to the Pythagorean 
aspects of this tradition than he would have done had he written all this later.) 
The result is a tightly-woven narrative which hangs together convincingly. It 
is aimed at American undergraduates, and assumes little scientific knowledge. 
This makes it particularly suitable as an introduction for beginners. 

Naturally the author is at his best when dealing with mechanics, terrestrial 
and celestial, and offers many suggestive comments on developments in these 
fields. He is particularly good on assessing the importance for the future of 
Kepler and Galileo who, as he says, opened the fields of celestial dynamics and 
terrestrial mechanics, respectively, or in reviewing the contents of Newton’s 
Principia, which he does in some detail. Necessarily there is some oversimplifi- 
cation. There is no hint that Galileo was not publicly a Copernican before 1610. 
The implication here is that Harvey’s experimental work was mainly with dogs, 
so that the reader is left in ignorance of the importance and elegance of his work 
on cold-blooded animals. It is not true that ‘France led the march of European 
science’ when the Académie Royale des Sciences was organized in 1666. And 
(p. 143) Newton’s experiments on capillary forces were conducted with oil of 
oranges, not orange juice. 

But these are minor blemishes on an otherwise careful, thoughtful and useful 
introduction, clearly written, which explains the main march of European 
science between Galileo and Newton in a most commendable fashion. 


MARIE BOAS HALL 


Imperial College 


STUDDERT-KENNEDY, GERALD [1975]: Evidence and Explanation in Social 
Science. London: Routledge and Kegan Paul. £6.50. Pp. viii+246. 

KEAT, RUSSELL and URRY, JOHN [1975]: Social Theory as Science. London: 
Routledge and Kegan Paul. £6.95. Pp. x+278. 


What is the state of play in the philosophy of the social sciences? It is not easy 
to say. Here we have two volumes, one by a political scientist, which worries at 
the question of how the many kinds of explanation in the social sciences actually 
connect up with, explain, the evidence; the other, co-written by a sociologist 
and a philosopher, argues for a “realist” interpretation of social science theory. 
So far as I can tell, neither seems really to be a speech in any on-going debate 
in the philosophy of the social sciences. This seems typical of the field at the 
moment. Various and sundry authors beaver away at their own pre-occupations 
without any vision of the field as a whole or their place in it. Hence there is an 
incoherence, a lack of centre to much that is published. There are many possible 
explanations. I would opt for professionalism and lack of scholarship. The 
volumes under review are cases in point. 

One of the many problems that lie behind Studdert-Kennedy’s book is how 
to apply a Kuhnian paradigm theory of science to the social sciences. Do 
Marxist and neo-functionalist accounts of the same society employ incompatible 
paradigms? His argument, so far as it can be discerned, is that they are incom- 
patible paradigms to some extent, yet they are also at different levels and hence 
can be ‘exploited to develop more inclusive and at the same time more penetrating 
interpretations of questions of great importance’ (p. 218). This is done by 
examining several examples at some length. In chapter one, ‘Equilibrium and 
Social Change’, the question is why Rwanda society broke down into massacre. 
An historical account is not detailed enough, some kind of class analysis is 
required. In chapter two, ‘Versions of Structure’, the different versions are 
seen as shifts of emphasis or perspective (p. 49). Chapter three discusses the 
similarities and differences between the natural and the social sciences, arguing 
that there are not two orders of reality or types of understanding, but the 
differences are not simply differences of degree either (p. 56). Chapter four, 
‘Statistical Models and Social Structures’, argues that statistical social science 
research needs theoretical control to identify facts and be relevant. Chapter 
five, ‘Causes and Structure’, says quantitative studies evade the problem of 
causation, while comparative analyses of social structure (‘Rationality and 
Structure’, chapter six) are misleading because they leave out actor-rationality. 
In chapter seven the suggestion is that there are ‘Levels of Theory’. Chapter 
eight, “Theories and Explanations’, offers case studies of Pakistan’s Green 
Revolution and of Balandier’s Marxist view of Africa to point out deficiencies 
and sufficiencies on the basis of the foregoing arguments. Chapter nine returns 
the discussion back to Kuhn. 

If anyone finds this a rich bill of fare, the book recommends itself. My 
objections to it are that it has no coherent philosophy of science, hence, it seems 
to wander along, not solving any problem, offering any thesis, or engaging in 
any debate; in short, it is unscholarly. Let me elaborate. 

The whole project of applying Kuhn to the social sciences is not grounded. 
Were Studdert-Kennedy to begin with a clear first-order problem in the social 
sciences, show how philosophical problems emerge from it, then show how 
Kuhn’s philosophy solves those problems and so helps with the first-order 
problem, then it would make sense. Social science would be advanced and 
Kuhn’s philosophy would have a concrete argument in its favour. Otherwise 
one has the vaguely textbook/systematic spectacle of an author plucking from 
the air of the philosophy of science a vogue-ish view and trying to work out its 
implications for the social sciences. If one must do this, why choose Kuhn? He 
is hardly less controversial than alternatives like Feyerabend, Lakatos, Quine, 
or, most important of all, Popper. Studdert-Kennedy offers no rationale for his 
choice. Kuhn’s philosophy emerged from and is meshed into intricate debates 
in the philosophy of science about change, growth, rationality, knowledge and 
history. None of this is alluded to by Studdert-Kennedy, so his book has no 
philosophical weight. It does not use Kuhn to connect debates about the social 
sciences with debates about the natural sciences, and hence its ‘application’ of 
Kuhn is jejune. 

My explanation is that professionalism is substituted for scholarship. Studdert- 
Kennedy’s book belongs to a curious sub-field: methodological discussions 
within political science. These discussions are curiously out of kilter with the 
philosophical material they derive from, much as are similar discussions in 
psychology. Dimly perceiving that their first-order work is leading them into 
philosophy, political scientists and psychologists turn to that field for guidance. 
Used to looking for findings and results, they try to see what the latest thing in 
philosophy is. Depending on who they talk to or what they dip into, the result 
is quite often unsatisfactory simply because they have failed to grasp that most 
issues in philosophy are in perennial debate, and that frequent mention of a 
person or idea by no means signals an agreed result or finding. 

It is thus no surprise that a good deal of what is discussed is warmed-over 
positivism. The worry is about operationalising concepts, about using empirical 
concepts and theoretical concepts and devising bridging laws or intermediate 
variables, etc. Such a level of philosophy of science has led to aberrations like 
the study of judicial behaviour as somehow more ‘objective’ than the study 
of the administration of justice. While positivist literature is invoked, and 
positivist vocabulary employed, lately the word has spread about a new sensation 
—Kuhn. The dark presence, the skeleton in the closet, only noticed with a 
latter-day (1976) prize for his 1945 masterpiece The Open Society and Its 
Enemies, yet whose jargon is everywhere, is Sir Karl Popper. Since he is at the 
centre of many current debates in the philosophy of science, why is he not 
confronted by spectators like Studdert-Kennedy? The answer is obvious. 
Positivism, Kuhn, even Lakatos, can be used (whether understood or misunder- 
stood) to legitimate present practice in political science, all that is involved 
is a re-ordering and a re-description of the work in a new vocabulary. Popper’s 
philosophy is far more radical, demanding as it does explicit debate about 
clearly stated problems, and being extremely sceptical of all formal and natural- 
istic methodology. He offers no formulas for legitimate science, only conventions 
for sharpening and improving debate. Hence Popper’s philosophy stems from 
and returns to the first-order level, where grave deficiencies are liable to lurk, 
having brought about the philosophic turn in the first place. 

Even more striking is Social Theory as Science, which scarcely mentions 
Popper, yet makes no sense without his jargon, and which fails to notice that 
he has already said much of what the authors want to advocate. The authors 
are a philosopher trained by Harré (Russell Keat) and a Cambridge-trained 
economist-turned sociologist (John Urry), who became friends during the 
‘political conflict’ at Lancaster University and decided to write a book ‘to change 
how we think about and carry out the scientific study of social life’. The problem 
on which they want to change thinking is that of the methodological unity of 
(or continuity between) the natural and the social sciences—the classical prob- 
lem, in other words, of Mill, Weber, Hayek and Popper. Their thesis is that 
pro-naturalism is correct, and their heroes are (the later) Marx and Weber; the 
former for his critique of political economy; the latter for his The Protestant 
Ethic and the Spirit of Capitalism. Marx needs a bit of rescue work because he 
sounds rather reductionist and deterministic at times, and because there are 
Hegelian glosses on him. Weber needs rescue work because he flirts with 
idealism. 

The rescue is effected by the ‘realist’ (i.e. Harré’s) philosophy of science, 
which detaches ‘positivism’ from naturalism, and disposes of conventionalism. 
To accomplish its task the book argues as follows. There are three main philo- 
sophies of science, positivism, realism and conventionalism (part I). Sociology 
has traditionally been positivist (part II, chapter four). Marx was a realist 
(part II, chapter five). Some structuralists are (Chomsky) and some are not 
(Lévi-Strauss) realists (part II, chapter six). Weber is a difficult case because 
he is neither positivist nor realist: he believes in using interpretive understanding 
on ideal types. This can be resolved by declaring interpretative understanding 
to be involved in natural science, and by treating agent’s reasons as causes (part 
III, chapter seven). The discussion of ideal types is inconclusive and peters 
out in a look at that joint derivative from Marx and Weber, ‘Critical Theory’ 
(part III, chapter nine). Meanwhile realism has been defended against the 
Marxist charge of ideology or reification (part III, chapter eight). 

From this summary it will be apparent that the book is a very odd one, 
although not because it is trying to change our thinking. Keat and Urry offer 
us precious little reason to follow them in the exciting quest of applying Harré’s 
philosophy of realism to Marx, structuralism, Weber, and Critical Theory. 
They could just as soon have chosen Kuhn, Feyerabend, Lakatos, even Popper. 
Especially Popper, since it is hard to see how their ‘realism’ differs from Popper’s 
‘modified essentialism’, which also allows that the aim of science is to give deep 
structural, causal accounts of the way the world is. Popper also goes on to try 
to explicate such notions as simplicity and depth, which Keat and Urry never do. 

Moreover, and again like Studdert-Kennedy’s, their book is much too 
second-order. In opposing the philosophies of positivism and conventionalism 
it is not enough simply to offer the standard philosophical objections. What 
counts in the methodology of the social sciences is that they are poor methods. 
Their poverty must be shown. 

Marx, Weber and Chomsky are formidable heroes, but to say that their ideas 
can be given a ‘realist’ interpretation is an extremely weak argument for them 
or for realism. It deserves to be remembered that positivism grows from an 
attempt to make a radical break with tradition and what were thought to be its 
errors; conventionalism was in turn an attempt to break with the imperialistic 
ambitions of empiricism-positivism. Where does ‘realism’ stand on the problem 
of knowledge and tradition? 

Professionalism is the display of skill without the attempt to give general 
consideration to the project at hand. It is most often displayed in discussion 
notes in journals where authors controvert the details of one author’s arguments 
in a context that is understood. Books on methodology that read like extended 
discussion notes are unintelligible. There is no given context of debate that can 
be taken as understood. There are however a few basic authors in relation to 
which book-length studies really must be mapped. 

And this connects up with the problem of scholarship. Above all the scholar 
in question is Popper. The Poverty of Historicism is the locus classicus on whether 
the social sciences can or should ape the natural sciences. It is on the whole 
pro-naturalistic, pro-causal, ‘realistic’; Marx is one of Popper’s heroes. What 
excuse is there for a book like Social Theory as Science to evade discussion of all 
this? 

I. C. JARVIE 

York University, Toronto 
The Society for Exact Philosophy announces a call for papers to be presented 
at its 7th meeting in Montreal on 3 to 5 June 1979. The papers may be on any 
philosophical subject provided it is treated with the help of some logical or 
mathematical tools. They can be up to 20 pages long but the reading time will 
be limited to 20 minutes. The papers should be submitted in triplicate with an 
abstract of about 100 words. The name of the author, with his or her address 
and the title of the paper, should be submitted on a separate sheet. The 
submission deadline is 1 April 1979, and the papers should be sent to the conference 
organiser, Mario Bunge, Foundations and Philosophy of Science Unit, McGill 
University, 3479 Peel Street, Montreal H3A 1W7, Canada. 

In addition to contributed papers the following lectures will be delivered at the 
meeting. David Braybrooke (Philosophy, Dalhousie), ‘Formal aspects of re- 
conciling attention to needs with attention to preference’; Eugenio Bulygin (Law 
School, Universidad de La Plata), ‘Permissions and permissive norms’; Mario 
Bunge (Philosophy, McGill), ‘Towards a formalization of the psychoneural 
identity theory’; Nancy Cartwright (Philosophy, Stanford), title to be announced; 
Lucio Chiaraviglio and Albert N. Bandre (Information and Computer Science, 
Georgia Institute of Technology), ‘Action and behavior’; L. Jonathan Cohen 
(Queen’s College, Oxford), ‘Assessing the reliability of medical diagnoses’; 
Zoltan Domotor (Philosophy, Pennsylvania), ‘Relationships between macro- 
theories and microtheories’; William S. Hatcher (Mathematics, Université 
Laval), ‘Platonism and pragmatism’; Lorenz Krüger and Wulf Gaertner 
(Philosophy, Universität Bielefeld), ‘Self-consistent libertarianism’; Joachim 
Lambek (Mathematics, McGill), ‘Intuitionist type theory in foundations’; 
Klemens Szaniawski (Philosophy, Warsaw University), ‘Science as an 
information-seeking process’; Roberto Torretti (Philosophy, Universidad de 
Puerto Rico), ‘Mathematics and ontology’, and Bas C. van Fraassen (Philosophy, 
Toronto and Southern California), ‘Belief and context’. 

The officers of the Society for the 1978-80 period are Dorothy Grover, 
President; Hugues Leblanc, Vice-President; Stephen K. Thomason, Secretary, 
and Nuel Belnap, Treasurer. 

Absolute Space and Conventionalism 
by DAVID ZARET 


    1 In this paper, I will examine the logic of Newton’s ‘rotating globe’ 
experiment in the context of a four-dimensional formulation of Newtonian 
theory. I will proceed by outlining a number of alternative global struc- 
tures for Newtonian space-time. Corresponding to each structure is an 
appropriate Newtonian theory; namely, a theory whose postulates are 
appropriate mathematical variants of the postulates of Newtonian 
mechanics and gravitation, and which posits the given structure as the 
structure for space-time. I will argue that if we adopt a conventionalist 
attitude towards these theories, then we can consistently maintain that 
the basic principles of Newtonian theory are true, and yet deny the 
Newtonian’s claim that certain observable phenomena must be interpreted 
in terms of acceleration relative to some non-material spatial entity. 
While I will focus on Newton’s experiment of the rotating globes, I 
believe that my arguments also apply to his bucket experiment. 

I will begin my discussion, in section 2, by presenting three different 
ways of formulating the structure of Newtonian space-time. In section 3, 
I will contrast the ways in which the realist and the conventionalist view 
the relative ontological status of these different structures. In section 4, 
I will examine Newton’s ‘rotating globe’ experiment, with particular 
emphasis on the difficulties which this experiment raises for the con- 
ventionalist. Finally, in section 5, I will try to show how the conventionalist 
can deal with these difficulties by using the results developed in sections 
2 and 3. 


    2 Newton’s remarks in the Scholium to his definitions in the Principia 
suggest that space-time consists of ‘absolute space persisting through 
absolute time’. A space-time described in this way can be thought of as 
having a direct product structure Space × Time, where Space is $\mathbb{R}^3$, 
Euclidean three-space, and Time is represented by $\mathbb{R}$, the real numbers. 
For any pair of points in this space-time, there is a uniquely defined 
spatial separation and a uniquely defined temporal separation. I will 
refer to a space-time with the structure $\mathbb{R}^3 \times \mathbb{R}$ as ‘Newton’s space-time’, N. 

According to a second formulation, the structure of ‘flat Newtonian 
space-time’, F, is as follows.1 F is a four-dimensional differentiable 
manifold homeomorphic to $\mathbb{R}^4$. We can define a function $t$ on F which 
represents ‘absolute time’. The subspaces $t$ = constant are three-dimensional 
manifolds, homeomorphic to $\mathbb{R}^3$; and through each point of F 
there passes exactly one such subspace. These subspaces are to be interpreted 
as the planes of absolute simultaneity, or ‘instantaneous spaces’ of 
F. F is also endowed with a contravariant tensor $g^{\alpha\beta}$, of signature 0 + + +, 
which satisfies $g^{\alpha\beta} t_\beta = 0$ (where $t_\beta = \partial t/\partial x^\beta$).2 This last condition implies 
that $g^{\alpha\beta}$ cannot be interpreted as a non-singular metric tensor for F; 
instead, it is to be interpreted as the metric tensor for the instantaneous 
spaces of F. In contrast to N, therefore, it is not the case that there is a 
well defined spatial separation for every two points of F. Instead, a unique 
spatial separation is defined only for pairs of points which lie in the same 
instantaneous space. Thus on this second formulation, it is no longer the 
case that space-time consists of ‘space persisting through time’. Finally, 
F possesses a flat symmetric affine connection $\overset{f}{\nabla}$, with components $\Gamma^{\alpha}_{\beta\gamma}$, 
which satisfies: 

(1) $\overset{f}{\nabla}_\alpha t_\beta = 0$. This ensures that $t$ is an affine parameter of $\overset{f}{\nabla}$. 

(2) $\overset{f}{\nabla}_\gamma g^{\alpha\beta} = 0$. This ensures that the length of spatial vectors is preserved 
under parallel transport by $\overset{f}{\nabla}$. 

The equations for the geodesics of $\overset{f}{\nabla}$ are given by 

(3) $d^2x^\alpha/dt^2 + \Gamma^{\alpha}_{\beta\gamma}(dx^\beta/dt)(dx^\gamma/dt) = 0$. 

Since $\overset{f}{\nabla}$ is flat, we can find a coordinate system in which its components 
vanish. In such an inertial coordinate system, the geodesic equations (3) 
reduce to 

(4) $d^2x^\alpha/dt^2 = 0$. 

Now to say that (3) and (4) are geodesic equations is to say that they are 
equations of motion for a particle which is not acted upon by any external 
forces. Hence (4) implies that a particle which is not acted upon by any 
force performs uniform straight-line motion with respect to an inertial 
system. 

According to a third formulation, the structure of ‘curved Newtonian 
space-time’, C, is derived from the structure of F as follows. First, the 
four-accelerations in F are defined by 

(5) $a^\alpha =_{\mathrm{df}} d^2x^\alpha/dt^2 + \Gamma^{\alpha}_{\beta\gamma}(dx^\beta/dt)(dx^\gamma/dt)$. 


1 In my presentation in section 2, I have followed Havas [1964] and Trautman [1965], 
[1967]. See also Earman and Friedman [1973]. 

2 Greek indices range and sum from 0 to 3; Latin indices range and sum from 1 to 3. 
Local coordinates are $(x^0, x^1, x^2, x^3)$. 

Thus Newton’s second law reads, 

(6) $m a^\alpha = F^\alpha$. 

In particular, the gravitational force is given by 

(7) $F^\alpha = m g^{\alpha\beta} U_{;\beta}$, where ‘;’ denotes covariant differentiation with 
respect to $\overset{f}{\nabla}$. 

Here $U$ is the gravitational potential, which satisfies 

(8) $g^{\alpha\beta} U_{;\alpha;\beta} = -4\pi\rho$. 

Now from (5), (6), and (7), we can rewrite the equations of motion 
as 

(9) $d^2x^\alpha/dt^2 + (\Gamma^{\alpha}_{\beta\gamma} - t_\beta t_\gamma g^{\alpha\delta} U_{;\delta})(dx^\beta/dt)(dx^\gamma/dt) = 0$. 

Following Havas, we can use (9) to define a new affine connection $\overset{c}{\nabla}$, with 
components 

(10) $\overset{c}{\Gamma}{}^{\alpha}_{\beta\gamma} = \Gamma^{\alpha}_{\beta\gamma} - t_\beta t_\gamma g^{\alpha\delta} U_{;\delta}$. 

In order to construct C, we take $\overset{c}{\nabla}$ rather than $\overset{f}{\nabla}$ to be the affine connection 
for space-time. Thus (9) is seen to be in the form of a standard 
geodesic law for C. In particular, (9) is to be interpreted as the law of 
motion for a particle under the influence of inertial forces only; gravitational 
and inertial forces have been identified in C, just as in general relativity. 
C does retain the temporal and spatial metric structures of F, and it also 
retains the “stratification” of space-time into instantaneous spaces. But 
since $\overset{c}{\nabla}$ rather than $\overset{f}{\nabla}$ is the affine connection for C, C and F differ with 
respect to global structure. Specifically, when gravitational sources are 
present, there is no global coordinate system in which all of the components 
$\overset{c}{\Gamma}{}^{\alpha}_{\beta\gamma}$ vanish. That is, C, unlike F, is a ‘curved’ space-time. 
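
The step from (5), (6) and (7) to (9) can be spelled out as follows. This is a reconstruction offered as a sketch, not the author's own derivation. It assumes only that trajectories are parametrised by the absolute time $t$, so that $t_\beta\,dx^\beta/dt = dt/dt = 1$; the dummy index in (7) is renamed $\delta$ to avoid a clash, and the `amsmath` package is assumed.

```latex
% Sketch: from (5), (6), (7) to (9). Requires amsmath.
\begin{align*}
  m a^{\alpha} &= m g^{\alpha\delta} U_{;\delta}
    && \text{(6) with (7)} \\
  \frac{d^{2}x^{\alpha}}{dt^{2}}
    + \Gamma^{\alpha}_{\beta\gamma}
      \frac{dx^{\beta}}{dt}\frac{dx^{\gamma}}{dt}
    - g^{\alpha\delta} U_{;\delta} &= 0
    && \text{(5), after dividing by } m \\
  t_{\beta}\frac{dx^{\beta}}{dt} &= \frac{dt}{dt} = 1
    && \text{(chain rule)} \\
  g^{\alpha\delta} U_{;\delta}
    &= t_{\beta} t_{\gamma}\, g^{\alpha\delta} U_{;\delta}\,
       \frac{dx^{\beta}}{dt}\frac{dx^{\gamma}}{dt}
    && \text{(two factors of 1 inserted)} \\
  \frac{d^{2}x^{\alpha}}{dt^{2}}
    + \bigl(\Gamma^{\alpha}_{\beta\gamma}
      - t_{\beta} t_{\gamma}\, g^{\alpha\delta} U_{;\delta}\bigr)
      \frac{dx^{\beta}}{dt}\frac{dx^{\gamma}}{dt} &= 0
    && \text{(9)}
\end{align*}
```

The bracketed term is symmetric in $\beta$ and $\gamma$, which is why it can serve as the components of a symmetric connection in (10).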


    3 Let ‘N*’ denote Newtonian mechanics and gravitation formulated 
in the context of the space-time N. Thus N* is the usual, three-dimensional 
formulation of Newtonian theory. Similarly, let ‘F*’ and ‘C*’ 
denote Newtonian mechanics and gravitation formulated in the context 
of the space-times F and C, respectively. Such four-dimensional formulations 
have been presented by Havas [1964] and by Earman and Friedman 
[1973]. What can we say about the relative status of N*, F*, and C*? 
According to the realist, the transitions from N* to F*, and from F* to C*, 
involve a straightforward shift in our ontological commitments. Thus in 
the transition from N* to F*, absolute space in the Newtonian sense is 
replaced by a non-denumerable infinity of instantaneous spaces, together 
with a flat connection, $\overset{f}{\nabla}$, which enables us to regard these spaces as part 
of the space-time F. And in the transition from F* to C*, the flat space-time 
F, in which gravitational forces exist, is replaced by the curved 
space-time C, in which gravitational forces do not exist. In particular, a 
freely falling particle is interpreted in F* as being deflected from its 
‘natural’ straight line motion by gravitational forces, while such a particle 
is interpreted in C* as travelling along a geodesic. Hence a freely falling 
particle is viewed in C* as not being subject to the action of any external 
forces. 

According to the conventionalist, on the other hand, these transitions 
do not involve any corresponding shift in our ontological commitments. 
Instead, they involve only a shift in how we have decided to describe the 
same physical state of affairs. Of course, in order that the conventionalist 
position be tenable, it is necessary that there be no observable difference 
between the theories N*, F*, and C*. Specifically, it is necessary that 
N*, F*, and C* be equivalent with respect to all possible observational 
evidence. Now in general, any attempt to establish observational equiva- 
lence in this strong sense is unlikely to be successful. For there are serious 
difficulties involved in any attempt to state precisely just what the differ- 
ence is between an observational entity and a theoretical entity. Hence 
it is difficult, if not impossible, to demarcate precisely and finally the 
boundary between the observational and the theoretical. However, it is 
not so clear that these difficulties pose any problem for the conventionalist 
who is concerned specifically with the relative status of N*, F*, and C*. 
For these theories differ only with respect to the questions of whether 
space-time is ‘stratified’ into instantaneous spaces, and which trajectories 
qualify as space-time geodesics. And I think it is clear that, according to 
any reasonable interpretation of ‘observable’, these questions cannot be 
settled by recourse to observable phenomena. In particular, it seems very 
unlikely that any modification in what counts as observable which is 
brought about by theoretical and/or technological progress could persuade 
us to regard instantaneous spaces or the property of ‘being geodetic’ as 
observable. 

In more detail, the conventionalist might attempt to establish observa- 
tional equivalence by arguing as follows. First, consider the transition 
from N* to F*. In N*, we are given absolute time, absolute space, and the 
standard axioms for Newtonian mechanics and gravitation. In F*, we 
are given absolute time, the flat instantaneous spaces, and the Newtonian 
axioms which have been rewritten in covariant, four-dimensional form. 
But it is impossible to discover by direct observation whether our universe 
consists of a single space persisting through time, or consists instead of a 
non-denumerable infinity of instantaneous spaces. For all that we can 
actually observe are the objects which exist ‘in space’; we do not directly 
observe either absolute space or the instantaneous spaces. N* and F* admit 
the very same objects, following the very same trajectories. The only 
difference is that, in the one case, we say that these objects exist and move 
in a space which persists through time; while in the other case, we 
describe these objects and their trajectories in terms of the instantaneous 
spaces and the connection $\overset{f}{\nabla}$. And this is certainly not an observable 
difference. 

Now if we take as our starting point the space-time F, there is no way 
of deriving the structure of any particular version of N, where different 
choices of which inertial frame represents the rest frame of absolute 
space correspond to different versions of N. This is because no inertial 
frame is ‘privileged’ from the standpoint of F*. For my purposes, however, 
it will be sufficient to assume that we have been given a particular version 
of N. That is, I assume that we have been given that a particular inertial 
frame, as defined by $\overset{f}{\nabla}$, represents the rest frame for absolute space. Under 
this assumption, we can derive N* from F* simply by stipulating that the 
given inertial frame is, indeed, the rest frame for absolute space, and then 
rewriting the covariant axioms of F* in three-dimensional form. 
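
A short calculation illustrates why no inertial frame is privileged from the standpoint of F. It is an illustration added here, not part of the author's argument; the constant boost velocity $u^i$ and the event labels $p$ and $q$ are names introduced only for this sketch, and the `amsmath` package is assumed.

```latex
% Sketch: a Galilean boost between two inertial coordinate systems of F.
\begin{align*}
  x^{\prime i} &= x^{i} - u^{i} t, \qquad t^{\prime} = t
    && (u^{i}\ \text{constant}) \\
  \frac{d^{2}x^{\prime i}}{dt^{2}} &= \frac{d^{2}x^{i}}{dt^{2}}
    && \text{so (4) holds in both systems} \\
  x^{\prime i}(q) - x^{\prime i}(p)
    &= \bigl(x^{i}(q) - x^{i}(p)\bigr)
       - u^{i}\bigl(t(q) - t(p)\bigr)
    && \text{coordinate separation of } p \text{ and } q
\end{align*}
```

The two systems agree on the coordinate separation of $p$ and $q$ exactly when $t(q) = t(p)$, that is, when the two events lie in the same instantaneous space. Fixing a particular version of N amounts to declaring one such system to be at rest by stipulation.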

Next, consider how C* can be ‘derived’ from F*. First, as indicated 
in equations (9) and (10), the structure of C can be derived from that of 
F by “packing” the gravitational potential into the connection $\overset{f}{\nabla}$. Second, 
suppose we are given the axioms of Newtonian mechanics and gravitation 
as formulated in F*. We then stipulate that freely falling particles follow 
geodesic paths in space-time. From the given axioms, together with this 
stipulation, it is a straightforward matter to write down the axioms for 
C*.1 Hence the actual motions of objects are the same in F* and C*. 
What differs, in the two theories, is what we say about these motions. In 
the one case, we say that a freely falling particle has been ‘deflected’ from 
its geodesic path by gravitational forces; in the other case, we say that the 
same particle is following a geodesic path, and is not subject to the action 
of any external forces. But again, this is not an observable difference. 

1 This procedure is partially outlined by Misner, Thorne and Wheeler [1973], chapter 12. 

As was the case with N and F, there are some complications involved 
in any attempt to derive the structure of F from that of C. For there is no 
unique way of ‘unpacking’ the connection $\overset{f}{\nabla}$, together with the gravitational 
potential, from $\overset{c}{\nabla}$. Instead, there are an infinite number of ways in 
which the decomposition of $\overset{c}{\nabla}$ into flat part plus gravitational remainder 
can be achieved.1 Hence if we take the space-time C as our starting point, 
there is no way of deriving any particular version of F, where different 
versions of F correspond to different flat connections. Again, however, 
it will be sufficient to assume that we have been given a particular version 
of F. That is, I assume that we have been given a particular flat connection 
$\overset{f}{\nabla}$, and a corresponding gravitational potential. Under this assumption, 
it becomes a trivial matter to derive the structure of F, together with the 
theory F*, from C and C*, respectively. 
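
One standard way to see this non-uniqueness is to pass to a frame that accelerates linearly, without rotating, relative to an inertial frame of the given flat connection. The following is a sketch under the sign conventions of (7) and (9), not a reproduction of Trautman's argument; the function $b^i(t)$ is an arbitrary, hypothetical displacement introduced for illustration, and the `amsmath` package is assumed.

```latex
% Sketch: two different (flat connection, potential) pairs giving the
% same trajectories. Requires amsmath.
\begin{align*}
  \frac{d^{2}x^{i}}{dt^{2}}
    &= \delta^{ij}\frac{\partial U}{\partial x^{j}}
    && \text{(9) in inertial coordinates of } \overset{f}{\nabla} \\
  x^{\prime i} &= x^{i} - b^{i}(t), \qquad t^{\prime} = t
    && \text{(accelerated, non-rotating frame)} \\
  U^{\prime} &= U - \delta_{jk}\,\ddot{b}^{j}(t)\, x^{\prime k}
    && \text{(redefined potential)} \\
  \frac{d^{2}x^{\prime i}}{dt^{2}}
    &= \delta^{ij}\frac{\partial U}{\partial x^{j}} - \ddot{b}^{i}
     = \delta^{ij}\frac{\partial U^{\prime}}{\partial x^{\prime j}}
    && \text{(same form as the first line)} \\
  \delta^{ij}\frac{\partial^{2} U^{\prime}}
                  {\partial x^{\prime i}\,\partial x^{\prime j}}
    &= \delta^{ij}\frac{\partial^{2} U}
                       {\partial x^{i}\,\partial x^{j}}
    && \text{(so (8) is unaffected)}
\end{align*}
```

The primed coordinates are inertial for a second flat connection which, paired with $U^{\prime}$, yields the same trajectories as the first pair. Since $b^i(t)$ is arbitrary, there are infinitely many such pairs.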

For the conventionalist, the fact that N*, F*, and C* are interderivable 
in the way just discussed implies that these ‘alternative’ theories do not 
represent genuine alternatives at all. Instead, they represent factually 
equivalent descriptions of physical reality; that is, they represent different 
expressions of the same theory. In particular, then, the conventionalist 
concludes that there is no fact of the matter as to whether N*, F*, or C* 
is correct. This conclusion has the following metaphysical implication: 
the possible universes N, F, and C, which the realist distinguishes, 
collapse into one universe. For example, in discussing the relative status 
of F* and C*, the conventionalist would argue as follows: the only things 
which exist are material bodies which behave in certain well defined ways, 
and whose behaviour can be described equally ‘correctly’ in terms of 
gravitational forces (as in F*), or in terms of the geometry of space-time 
(as in C*). 

Now this conclusion appears to be tenable only if we place some limitations 
on what phenomena can qualify as ‘factual’. For example, F* and C* 
do disagree on the questions of which trajectories count as geodesics, and 
whether freely falling bodies are subject to external forces. Hence if 
questions regarding geodesics and (gravitational) forces do qualify as 
having factual content, F* and C* are clearly not equivalent. Therefore, 
if the conventionalist’s position is to be acceptable, he must deny that 
these questions involve the factual content of Newtonian theory. But this 
last claim must be justified. 

One could justify this claim, of course, by adopting a general reductionist 
position, according to which the only things which exist are observable 
things and observable properties of observable things. Specifically, the 
conventionalist might adopt a principle along the lines of 


(R) The factual content of a theory is exhausted by what the theory 
has to say about phenomena which can be directly and locally 
observed. 

1 Trautman [1967], pp. 417-18. 

When the conventionalist bases his argument on (R), then his conclusions 
regarding the relative status of N*, F*, and C* do follow from the observational 
equivalence of these theories. However, it then becomes incumbent 
upon the conventionalist to justify (R). The issues involved in an attempted 
justification of (R) are just the issues which arise from a general reduc- 
tionist critique, and will not be discussed in this paper. Instead of discuss- 
ing these issues, I will simply assume that the conventionalist has adopted 
a principle such as (R), and will examine the viability of his position in 
light of Newton’s famous thought experiment. 


4 Newton invites us to imagine a universe in which nothing exists 
except for two spheres connected by a rope. He argues that these spheres 
either are, or are not in rotation about their common center. If they are 
in rotation, there will be tension in the rope joining them; if they are not 
in rotation, the rope will be slack. Hence Newton argues, 
(11) Suppose there is tension in the rope. 
(12) This tension is an indication of centrifugal force effects. (Newtonian 
mechanics) 
(13) All centrifugal force effects are induced by rotational motion. 
(Newtonian mechanics) 
∴ (14) The spheres are rotating. (11, 12, 13) 
(15) The spheres are not rotating relative to one another. 
∴ (16) The spheres are rotating with respect to absolute space. (14, 15) 


Hugh Lacey argues that Newton’s thought experiment does establish 
that absolute space plays an essential explanatory role in Newtonian 
theory. 

Since there is no relative motion, according to the relationist there is no motion 
at all—and without a motion, there cannot be a tension in the cord (unless 
Newton’s laws of motion, as well as his interpretation of them, are wrong), 
and there cannot be any variation of tension. For the absolutist the existence of 
tension would be a symptom of motion; for the relationist there cannot be a 
tension in the cord. If we could produce the two-globe universe, we would 
have a potential empirical test between the two theories.1 


Lacey makes two points in this passage which are directly pertinent to the 
issues being discussed in this paper. 

(i) If the relationist accepts (11) and (12), but denies the conclusion 
of the argument by rejecting (13), then his position is inconsistent 
with Newtonian mechanics. 

(ii) There is a potential empirical test between the relationist and 
absolutist theories. 


1 Lacey [1970], pp. 330-1. 

Claim (i) does appear to be correct, since (13) is a fundamental principle 
of Newtonian mechanics. However, I believe that there are several ways 
in which the relationist might attempt to deal with the difficulties suggested 
by (i). First there is at least one reply which a supporter of Mach’s 
position might make to (i). Thus the Machian might note that the principle, 
‘all centrifugal force effects are induced by rotational motion’, has been 
confirmed by actual observations made in our world. And he might argue 
that there is no compelling reason to suppose that laws which have been 
thus inductively confirmed in our world, would continue to hold in a 
universe in which nothing exists except for two spheres and a rope (or a 
bucket of water). Indeed, Mach himself held a much stronger position 
than this, as is evidenced by the following passage. 


All our principles of mechanics are, as we have shown in detail, experimental 
knowledge concerning the relative positions and motions of bodies. Even in the 
provinces in which they are now recognised as valid, they could not, and were 
not, admitted without previously being subjected to experimental tests. No one 
is warranted in extending these principles beyond the boundaries of experience. 
In fact, such an extension is meaningless, as no one possesses the requisite 
knowledge to make use of it.1 


Thus according to Mach, scientific hypotheses are not admissible unless 
they actually can be tested experimentally. And scientific principles 
cannot be regarded as established unless they actually have been tested 
experimentally. Hence on Mach’s view, scientific principles do not express 
physical necessity; they are not ‘laws of nature’ which support contrary- 
to-fact conditionals. Now Newton’s imaginary universe is not physically 
realisable; we cannot actually annihilate all the matter in the universe 
except for a single, isolated system. Therefore, if we follow Mach in 
regarding as admissible only those hypotheses which actually can be 
tested in sense experience, it follows that hypotheses concerning what 
would happen in these universes are not admissible. In other words, Mach 
would regard the assertion that the principles of Newtonian mechanics 
would continue to hold in the two sphere universe as not only empirically 
unwarranted, but meaningless. 

1 Mach [1919], p. 229. Lacey [1970] also cites this passage, and draws a contrast similar 
to the one I have drawn between Machian and Newtonian approaches to scientific 
explanation. 

If the relationist does follow Mach in refusing to admit that empirically 
established principles can be ‘extended beyond the boundaries of experience’, 
then he can consistently maintain that (i) is incorrect. For according 
to Mach’s position, it is perfectly consistent to agree with Newton that, 
in our world, ‘all centrifugal force effects are induced by rotational motion’; 
and yet deny that this principle expresses any kind of nomic universality. 
Hence the relationist can agree that Newtonian mechanics is applicable 
to our world, with the proviso that absolute space be replaced by the 
‘fixed stars’; and at the same time, he can deny that Newtonian mechanics 
is applicable to the two sphere universe. 

However, it is important to question the legitimacy of a Machian 
approach. For in contrast to this approach, Newton has subsumed both 
the actual world and various physically possible worlds under one inclusive 
theory. And one might argue that, because Newton has adopted the view 
that scientific laws should support contrary-to-fact conditionals, New- 
tonian theory is preferable to a Machian approach. In other words, one 
might subscribe to the view, held by Newton and many others, that physical 
laws do express physical necessity; they do apply to such physically 
possible worlds as the two sphere universe. And one might argue, further, 
that the capacity to explain all physical phenomena, without regard to 
such contingent features of the universe as the number and configuration 
of the objects it contains, is an ideal to which scientific theorising should 
strive to conform. 

I do feel that we should require that the principles of any ‘good’ theory 
be lawlike; that is, we should require that these principles cover counter- 
factual cases which are physically possible. And in what follows, I will 
regard this requirement as a necessary condition for the adequacy of a 
physical theory. It is in this context that (i) can be considered to be at least 
partially correct. For while it is consistent, according to a Machian 
approach, to accept Newtonian mechanics but deny the applicability of 
(13) to the two sphere universe, this approach is inconsistent with the 
requirement that the principles of Newtonian mechanics be lawlike. 
Hence I will interpret the argument suggested in (i) as establishing that a 
Machian approach is unacceptable. 

Although a Machian approach is unacceptable, there are alternative 
approaches available to the relationist. Now the relationist cannot admit 
that the only way in which he can consistently reject Newton’s conclusion 
(16) is by claiming that (13) is simply false. For if the relationist can deny 
the existence of absolute space only by denying such principles as the 
one stated in (13), and hence by denying the correctness of Newtonian 
mechanics, it seems to follow that just as one cannot have, say, molecular 
theory without molecules, so one cannot have Newtonian theory without 
absolute space. And this result is unacceptable to the relationist who 
bases his critique of absolute space on reductionist principles. 

To see why this is so, first observe that there is one interpretation of (13) 
under which the reductionist must view it as false. This is the usual, 
Newtonian interpretation, according to which talk of centrifugal force 
effects does commit us to the existence of impressed forces. Since the 
reductionist denies the existence of any forces, he must regard (13) as 
false under this interpretation. However, it is certainly an empirical fact 
that there is a correlation, in our world, between accelerations and the 
occurrence of those phenomena which are called “force effects”. Hence 
if we interpret the term “centrifugal force effect” as referring, not to 
‘effects-caused-by-forces’, but rather to a certain class of observable 
phenomena which are ordinarily associated with rotational motion (such 
as the change in the shape of the surface of the water in a rotating bucket), 
then it is consistent for the reductionist to regard (13) as meaningful. 
For, on this second interpretation, (13) does refer exclusively to events 
which take place on an observational level, and hence is falsifiable. 
Similarly, on this second interpretation, (12) does not tell us that forces 
are acting upon the two spheres; it tells us only that the observed tension 
in the rope is one of those phenomena which are ordinarily associated 
with rotational motion. 

When we construe the argument (11)-(16) in accordance with this 
interpretation, then I believe that the reductionist cannot reject that 
argument on the grounds that it incorporates theoretical assumptions 
into its premises. Therefore, if the reductionist’s only means of rejecting 
Newton’s conclusion (16) were by showing that one of (11)-(15) is false, 
he would have to show that one of these premises is false as a matter of 
empirical fact. In particular, let us suppose that the reductionist can reject 
(16) only by establishing that (13) is false. Then the only pertinent 
alternatives seem to be: (13) is false and it is consistent to maintain that 
absolute space does not exist; or (13) is true and absolute space does exist. 
If these are the only alternatives, then it follows that, even for the reduc- 
tionist, the existence of absolute space is a matter of empirical fact. That 
is, it follows that it is at least intelligible to suppose that absolute space 
exists. 

In order to avoid these consequences, the reductionist cannot regard 
(13) as simply false. Instead, he must agree that there is a lawlike con- 
nection between force-effects and accelerations. At the same time, in 
order to avoid Newton’s conclusion (16), he must argue that (13) is not 
literally true. In bare outline, such an argument might run as follows: 
‘While it is true that “all centrifugal force effects are induced by rotational 
motion”, when we say “rotational motion”, we don’t mean “rotational 
motion”.’ This, of course, is extremely vague, but I will try to make it 
more precise later. For now, I merely wish to emphasise the following: 
what Lacey’s claim (i) does establish is that, if the reductionist is to have 
any hope of success, he cannot flatly reject Newtonian theory. Instead, he 
must reinterpret Newtonian theory in terms of some alternative to absolute 
space. And, given the requirement that the principles of Newtonian 
mechanics be lawlike, he must show that one such alternative theory is 
adequate to explain events both in our world and in various physically 
possible worlds. 

If Lacey’s claim (ii) were correct, then the question at issue in this 
paper would be settled. For as was just argued, to say that we can imagine 
an empirical test which would decide between the relationist and 
absolutist theories, is to imply that each of these theories is at least 
intelligible. In particular, it is to imply that the existence of absolute 
space is not a matter of convention, but a matter of empirical fact. However, 
I do not feel that (ii) is correct. Now I do agree with Lacey that it is 
perfectly intelligible to suppose that a two sphere universe exists in which 
there is tension in the rope. Hence if, in denying the intelligibility of 
absolute space, the relationist were required as well to deny the possibility 
of tension in the rope, I do not think that his position would be tenable. 
But I do not agree that the presence of tension in the rope is incompatible 
with the relationist’s position. What is incompatible with the relationist’s 
position is, of course, the assumption that there could be any motion in 
the two sphere universe. But again, the relationist can simply deny that, 
in the two sphere universe, all force effects are accompanied by accelerated 
motion. Indeed, since he does not admit the possibility of motion in such 
circumstances, the relationist is required to give a non-Newtonian account 
of forces in the two sphere universe. Or at least, since he may not even 
recognise the existence of forces, the relationist is required to give a non- 
Newtonian account of events in this universe. I will consider below the 
question of just what such an account would involve. But assuming, for 
now, that such an account can be given, it is clear that the presence of 
tension in the rope cannot decide between the two theories. Instead, we 
would have two logically incompatible theories, each consistent with the 
empirical data (namely, tension in the rope). 


5 In order to see how an alternative to the Newtonian account would 
proceed, it will be helpful to recall the space-times N, F, and C which 
were discussed earlier. For what I would like to argue is that Newton’s 
thought experiment is not strong enough to determine the structure of 
space. Thus while Newton probably had the structure N in mind when 
he developed his experiment, it is also consistent for us to accept each of 
the steps (11)-(15), and yet insist that ‘space’ derives its structure from 
the space-time F, or from the space-time C. 

The reason for this underdetermination is as follows. First, let us 
stipulate that a system is ‘experiencing absolute rotation’ just in case it is 
rotating, but is not rotating with respect to any material system. Then it 
does follow directly from (14) and (15) that 

(16’) The spheres are experiencing absolute rotation. 

For by (14), the spheres are rotating; and by (15), since there is no material 
system other than that comprised by the two spheres and the rope, this 
rotation cannot be defined with respect to any material system. 

Now to establish (16’) is not quite to establish (16). For in order to 
derive (16) from (16’) we must show that if a system is rotating absolutely, 
then it is rotating with respect to absolute space. In other words, we must 
show that if something is rotating absolutely, then it is experiencing an 
absolute change of position. However, it may seem that this latter deriva- 
tion is a straightforward one. For to say that a system is rotating is just 
to say that it has a non-zero angular velocity; that is, it is experiencing
an angular change of position. Hence (16’) tells us that the spheres are 
in motion. But again, since there is no material system other than that 
comprised by the two spheres and the rope, this motion cannot be defined 
with respect to any material system. And to say that something is moving, 
but is not moving with respect to any material system, is to say that it is 
experiencing an absolute change of position; that is, it is to say that it is 
moving in absolute space. Thus we can apparently derive (16) from (16’) 
by arguing from absolute rotation to absolute change of position to absolute 
space. 

Underlying this argument is the notion that one cannot define accelera- 
tion without reference to the notion of velocity. For ‘acceleration’ means 
‘change in velocity per unit of time’. And a system cannot very well 
experience a change in velocity unless it has a velocity in the first place. 
Therefore, according to this general argument, if a system has an absolute 
acceleration (or is absolutely rotating), it must have an absolute velocity 
as well. However, this argument is fallacious; for it is possible to define 
absolute spatial acceleration without reference to a notion of absolute 
change of position or absolute spatial velocity.1 For example, consider 
the space-time F. We have seen that, if two points of F do not lie in the
same hypersurface t = constant, their spatial separation is not well
defined. It follows that spatial velocity is not well defined in F. More
precisely, let $x^i = x^i(t)$ be the space-time trajectory of a particle in F.
The velocity of the particle is represented by the vector v with unit time


1 This point is made in some detail by Stein [1967], pp. 182-4; and by Earman [1970], 
pp. 296-7.

component, which is tangent to the curve $x^i(t)$: $v^i = dx^i/dt$. Now the
spatial velocity of the particle is represented by the vector whose components
are the spatial components of v; but again, this is not well defined
in F, because there is no unique way of projecting v onto a surface of
simultaneity. But now consider the acceleration of the particle. The
components of the instantaneous acceleration vector a are given by equation
(5), above. Alternatively, we may write $a^i = v^j \nabla_j v^i$. Hence
$a^i t_i = (v^j \nabla_j v^i) t_i$ = (by the Leibniz rule)
$v^j \nabla_j (v^i t_i) - v^j v^i (\nabla_j t_i)$. But the first term here
vanishes because $v^i t_i = 1$; and the second term vanishes because, by
equation (1), $\nabla_j t_i = 0$. Therefore, $a^i t_i = 0$. Thus a is always
tangent to a surface of simultaneity; that is, a is a spatial vector. Given that a is a
spatial vector, it follows that it has a well defined spatial length in F. This 
length represents the magnitude of the instantaneous acceleration of the 
particle in question. A similar argument establishes that spatial acceleration 
(but not spatial velocity) is well defined in C.1 
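
The contrast between velocity and acceleration can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes that F may be read as a Galilean space-time whose admissible inertial charts differ from one another by constant-velocity boosts. The function names, the angular speed OMEGA, the radius RADIUS, and the sample boosts are all hypothetical choices made only for illustration.

```python
import math

OMEGA = 2.0   # angular speed of the spheres (illustrative value)
RADIUS = 1.5  # distance of a sphere from the rope's midpoint (illustrative value)

def position(t, boost):
    """Coordinates of one sphere at time t in an inertial chart that moves
    with constant velocity `boost` relative to the rope's midpoint."""
    bx, by = boost
    return (RADIUS * math.cos(OMEGA * t) - bx * t,
            RADIUS * math.sin(OMEGA * t) - by * t)

def velocity(t, boost, h=1e-3):
    """Central-difference estimate of the chart-relative spatial velocity."""
    x0, y0 = position(t - h, boost)
    x1, y1 = position(t + h, boost)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

def acceleration(t, boost, h=1e-3):
    """Second central difference; the boost term is linear in t and cancels."""
    xm, ym = position(t - h, boost)
    x0, y0 = position(t, boost)
    xp, yp = position(t + h, boost)
    return ((xp - 2 * x0 + xm) / h**2, (yp - 2 * y0 + ym) / h**2)

if __name__ == "__main__":
    t = 0.7
    for boost in [(0.0, 0.0), (3.0, -1.0)]:
        # The velocity tuple depends on the chosen chart; the acceleration does not.
        print(boost, velocity(t, boost), acceleration(t, boost))
```

On this reading, the two charts disagree about the sphere's spatial velocity but agree, up to rounding, about its acceleration, whose magnitude is approximately OMEGA squared times RADIUS. Multiplying that magnitude by a sphere's mass would give the rope tension under Newton's second law, again without any chart-independent velocity being assigned.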

This argument shows that (16) does not follow from (16’). For since 
we can define an acceleration, or velocity-difference vector in situations 
in which we cannot define a velocity vector, it follows that we can make 
sense of the notion of absolute spatial acceleration or absolute spatial 
rotation without appealing to an idea of motion in an unchanging space 
which persists through time. In other words, we do not have to appeal 
to the idea of a space in which we can uniquely identify points through 
time, and in which, therefore, we can tell whether or not a change in 
spatial position has ‘really’ taken place. Hence the inference from force- 
effects to accelerations to ‘real’ motion simply is not valid. Now it was 
argued earlier that, if the relationist is to avoid the difficulties suggested 
by Lacey’s claim (ii), he must construct a version of Newtonian theory which
is observationally equivalent to but logically incompatible with Newton’s 
version. In particular, it was argued that the relationist must reinterpret 
Newton’s principle, ‘all centrifugal force-effects are induced by rotational 
motion’, in such a way as to avoid committing himself to the presence of 
motion in the two-sphere universe. I believe that we can see, now, how 
the relationist can accomplish this: he can reinterpret Newtonian theory 
by replacing the space-time N by either of the space-times F or C. In 
particular, the relationist need only adopt either of the theories F* or C*, 


1 For the fact that $\nabla_j t_i = 0$, see Trautman [1965]; or Misner, Thorne and Wheeler [1973],
chapter 12.

and he can consistently deny the existence of absolute space (in the sense
of N) without denying the truth of Newtonian mechanics. 

The relationist can argue in a similar manner with respect to Lacey’s 
claim (ii). Recall that, according to this claim, the presence of tension in
the rope in the two-sphere universe proves that the spheres are moving 
in relation to absolute space. But again, the relationist can consistently 
admit that there is tension in the rope, while denying that the spheres are 
experiencing an absolute change of position, simply by interpreting the 
two-sphere universe in terms of F* or C*. Thus the relationist can maintain
that the presence of tension in the rope indicates that the spheres are
accelerating, in accordance with Newton’s second law. But it is consistent 
for him to deny that the spheres are experiencing an absolute change of 
position. Indeed, this latter notion is not even intelligible in the context 
of F* or C*.

Now there is at least one reply which the absolutist might make to the 
relationist approach just outlined. This reply would be directed specifically 
at the relationist who bases his objection to absolute space on general 
reductionist principles, and who rejects the claim that any theoretical 
entity actually exists. The absolutist might reply to such a reductionist 
that, while Newton’s thought experiment does not demonstrate that space- 
time definitely has the structure N, it does demonstrate that space-time 
definitely has one or the other of the structures, N, F, or C. And, according 
to this argument, this result is strong enough to show that the reduction- 
ist’s position is untenable. For if the reductionist’s ultimate concern is to 
rid his ontology of everything except observable things or observable 
features of observable things, then the non-denumerable infinity of instan- 
taneous spaces posited by F* and C* should be just as distasteful to him 
as is the single enduring space associated with N. 

The absolutist could attempt to justify this argument by recalling 
Lacey’s claims (i) and (ii). It was suggested that the relationist’s only
adequate response to (i) and (ii) involved his reinterpreting Newtonian
theory in terms of some alternative to absolute space. It was suggested,
further, that such an alternative is provided either by F* or by C*. But
it is certainly the case that, although there is no such thing as absolute 
change of position in F* or C*, one does have to define acceleration in 
these theories. And one defines acceleration in these theories in terms of 
the affine connection and the instantaneous spaces. In other words, one 
can define acceleration in these theories only by invoking autonomous 
entities which exist independently of material objects. Hence if Newton’s 
thought experiment does, indeed, establish that the reductionist must 
reinterpret Newtonian theory in terms of some alternative to absolute 
space; and if it is also the case that the only such alternatives make essential
reference to such non-material entities as the affine connection and the 
instantaneous spaces; then the reductionist does seem to be forced to 
accept the existence of these non-material entities. In terms of the specific 
argument discussed earlier, the absolutist might state this conclusion by 
claiming that we should infer the following from (16’):


(16”) There must be some structure, not identical to the system consist- 
ing of the two spheres and rope, which enables us to define 
absolute rotation. 


The difficulty with this argument is that it seems to presuppose the 
validity of absolutist or realist principles. Thus according to the realist, 
to say that one must reinterpret Newtonian theory in terms of some alterna- 
tive to absolute space is to say that we must replace Newtonian theory 
by another theory, such as F* or C*, which is logically incompatible 
with Newtonian theory. Hence for the realist, such a move involves 
replacing one set of ontological commitments by another set; for ex- 
ample, it involves replacing absolute space by a non-denumerable 
infinity of instantaneous spaces. But now suppose that the reductionist 
has adopted the kind of conventionalist strategy outlined in section 3. 
Then he will argue that we are committed only to the existence of those 
objects and events which it is possible locally and directly to observe. In 
particular, he will argue that, if theories which are apparently logically 
incompatible are equivalent with respect to all possible observations, 
then they are not incompatible at all. Instead, they represent different 
formulations of the same theory. As we have seen, the conventionalist 
argues in precisely this manner with respect to the theories N*, F*, 
and C*; that is, he infers from the presumed fact that these theories are 
empirically underdetermined, that there is no fact of the matter as to 
whether N, F, or C represents the real structure of space-time. Now what 
(16’’) tells us is that Newton’s thought experiment cannot resolve this
underdetermination; while the realist will identify the ‘structure’ referred 
to in (16’’) with some non-material spatial entity, the conventionalist will 
view the indeterminacy suggested by the phrase ‘some structure’ as
supporting his position. For it is not the case that Newton’s thought 
experiment provides a set of observational data which can be suitably 
interpreted only in terms of absolute space. Instead, this data can be 
interpreted equally well by N*, F*, or C*. And given that N*, F*, and C* 
are thus observationally equivalent, the conventionalist can invoke his 
principle, ‘Empirical underdetermination of theories implies no fact of 
the matter as to which theory is correct’.

It is clear, I think, that Newton’s thought experiment does not provide
any grounds for questioning the validity of such a general principle. 
Moreover, as we have seen, his experiment does not enable the absolutist 
to refute, without begging the question, the argument of the conventional- 
ist who accepts such a principle. Therefore, Newton’s experiment cannot 
resolve the dispute between the absolutist and the conventionalist. 

I have argued that it is consistent for the conventionalist to maintain 
that the principles of Newtonian theory are true, that these principles 
express physical necessity, and that the global structure of space-time is a 
matter of convention. Of course, as I have presented it, the conventionalist’s
position rests on reductionist principles which may themselves be
rather dubious. However, I have suggested that when we attempt to 
assess the relative merits of the conventionalist and absolutist positions, 
it is principles of this kind which we must examine. In particular, then, 
the ontological status of absolute space in the context of Newtonian theory 
is not simply an empirical matter. The question of whether the con- 
ventionalist can avoid reliance on general reductionist principles by basing 
his position on features which are peculiar to Newtonian theory is, I feel, 
worth further study. 


University of Wisconsin, Milwaukee


Towards an Objectivist Account of
Theory Change*


by ALAN CHALMERS 


The accounts of theory change offered by Karl Popper and Imre Lakatos 
do not live up to the objectivist aspirations of those authors. Part of the 
trouble stems from the role they attribute to the decisions and choices of 
individual scientists or groups of scientists. It is this feature of Lakatos’s 
methodology that makes it possible for Paul Feyerabend ([1975], p. 200)
to describe it as an ‘anarchism in disguise’. In this paper I aim to draw on
Lakatos’s methodology to construct an objectivist account of theory
change that removes decisions and choices from a position of epistemo- 
logical primacy, an account that can withstand some of Feyerabend’s 
criticism. My account exploits to maximum advantage the notion of an 
objective opportunity. 

Decisions of a variety of kinds figure in Popper’s methodology of 
science. In the present context the important point is that for him ‘it is 
decisions that settle the fate of theories’.1 This theme is also evident in
Lakatos’s elaboration of his methodology of scientific research programmes. 
In his [1971a] (p. 100) he sought for ‘rules for the elimination of whole 
research programmes’ and in his [1970], (p. 157) he wrote of the rejection 
of a research programme as ‘the decision to cease working on it’. So long 
as accounts of theory change, or change of research programme, are 
framed in such terms it is easy for Feyerabend ([1970], p. 215) to describe 
them as verbal ornaments and to capitalise on their failure to enhance his 
policy of ‘anything goes’. For no methodology is capable of accounting for 
scientific progress by dictating that under certain specified conditions 
theory A must be rejected, theory B must be adopted, or theory A must be 
preferred to theory B. A theory or research programme, however sorry a 
state it is in, may always take a turn for the better, and no methodology 


* This paper has benefited from criticisms of an earlier draft by Jean Curthoys, Denise 
Russell, Wal Suchting and John Worrall. 

1 Popper [1935], p. 108, emphasis in original. The decisions to which Popper refers here 
are those concerning the acceptance or rejection of basic statements. Other decisions 
are involved in Popper’s methodology such as, for example, those that serve to demarcate 
a theory under test from unproblematic background knowledge.

can be powerful enough legitimately to rule out such a possibility. Speaking
of Popper’s methodology, Alan Musgrave ([1974a], p. 584) points out 
that ‘a scientist need not always “choose” that theory with the highest 
degree of corroboration’ while, with reference to Lakatos’s methodology of 
scientific research programmes, Feyerabend ([1970], p. 215) observes that 
‘what may look like a degenerating problem shift may be the beginning of 
a much longer period of advance’. 

Lakatos and his followers seem to appreciate the force of this kind of 
criticism to some extent. It led Lakatos ([1971b], p. 194) to assert 


... when it turns out that, on my criteria, one research programme is ‘progressing’
and its rival is ‘degenerating’, this tells us only that the two programmes
possess certain objective features but does not tell us that the scientist must 
work only on the progressive one... 


But if the matter is left there, then the methodology of research programmes 
has failed to yield an adequate account of scientific change.1

I suggest that the way can be cleared for a solution to the problem if a 
distinction is made which has not been adequately appreciated or adhered 
to within developments and applications of the methodology of research 
programmes. The distinction is between two kinds of problem. The first 
is concerned with an objectivist account of theory change. The second 
concerns research strategies and the guidance and evaluation of the 
choices and decisions of scientists. For convenience I will refer to this 
latter kind of problem as the problem of choice. If scientific change is 
understood in terms of the choices and decisions of scientists then of 
course the two problems might well collapse into one. When in his [1973], 
Elie Zahar addressed the question ‘Why did Einstein’s theory supersede 
Lorentz’s?’ he was dealing with a problem of the former kind. But when, 
in that article, he posed the problem ‘Why did brilliant mathematicians and 
physicists like Minkowski and Planck abandon the classical problem in 
order to work on Special Relativity?’ he was dealing with the problem of 
choice. I do not wish to maintain that the two problems are unconnected. 
In particular, a solution to the problem of change would have a bearing 
on the problem of choice. I regard Lakatos’s methodology as being capable 
of shedding light on both kinds of problem. My main concern in this 
paper is the problem of change. I aim to indicate how Lakatos’s method- 
ology can be utilised for the construction of a fully objectivist account of 
theory change that is concerned solely with ‘knowledge in the objective

1 Musgrave has attempted to save the situation in his [1976] but I think Feyerabend has
successfully shown in his [1976] that the shift from the individual scientist to the scientific
community as a whole, introduced by Musgrave, fails to solve the fundamental
problem.

sense’1 and which takes full heed of ‘the cleavage between objective knowledge
. . . and its distorted reflections in individual minds’.2

The objectivist account of scientific change draws on a conception 
closely associated with, but not identical to, Lakatos’s notion of the 
positive heuristic of a research programme. For Lakatos, the positive 
heuristic is a research policy. It consists of ‘a partially articulated set of 
suggestions or hints on how to change, develop the “refutable variants” 
of the research programme, how to modify, sophisticate, the “refutable”
protective belt’.3 The conception I wish to employ is somewhat different.
It refers to the set of objective opportunities for future development 
inherent in a programme. It is what we might call the ‘degree of fertility’ 
of a research programme. Objective opportunities for development will 
exist within a programme whether or not scientists realise it and whether 
or not those opportunities are taken advantage of. With respect to a 
particular programme, one research policy or set of suggestions or hints 
may be more appropriate than an alternative in the light of the objective 
opportunities that in fact exist. An important property of a research 
programme, then, will be its degree of fertility, the extent to which it 
offers opportunities for future development, the number of new avenues 
it opens up. 

Galileo’s mechanics is an example of a programme with a high degree 
of fertility in the sense intended. Because of the way motion was mathema- 
tised in that theory, the opportunity existed for bringing the machinery of 
mathematics into operation for the derivation of numerous theorems 
concerning motion. As M. Clavelin ([1974], p. 271) puts it, ‘once it has 
been turned into an autonomous phenomenon, in the manner of a mathe- 
matical object, motion can be defined genetically and its fundamental 
properties studied methodically’. Galileo took advantage of some of those 
opportunities in his Two New Sciences. Stillman Drake ([1970], p. 97) has 
stressed the extent to which Galileo’s theory opened up new avenues for 
future development. He writes, 

It was Galileo who, by consistently applying mathematics to physics and 
physics to astronomy, first brought mathematics, physics and astronomy together 
in a truly significant and fruitful way. The three disciplines had always been 
looked upon as essentially separate; Galileo revealed their triply paired relation- 


ships, and thereby opened new fields of investigation to men of widely divergent 
interests and abilities. 


1 Popper writes: 
Knowledge in this objective sense is totally independent of anybody’s claim to know; 
it is also independent of anybody’s belief, or disposition to assent; or to assert, or 
to act. Knowledge in the objective sense is knowledge without a knower: it is knowledge
without a knowing subject. (Popper [1972], p. 109; italics in the original.)

2 Lakatos [1971a], p. 99. 3 Lakatos [1970], p. 135.

The Lakatosian case-studies that have already been produced in support
of the methodology of scientific research programmes themselves serve 
to illustrate the appropriateness of the notion of degree of fertility. For 
example, Zahar [1973] has argued for the superiority of Einstein’s theory 
of Special Relativity over Lorentz’s in 1905 by noting that principles of 
the former were formulated in a very general way that made possible their 
detailed application in many areas of physics. In the terminology I am 
proposing, Einstein’s theory presented objective opportunities for future
development to a greater degree than Lorentz’s and so possessed a greater 
degree of fertility than the latter. 

Once attention is focused on the objective opportunities for development 
contained within a programme, an account of theory change becomes 
possible that avoids the difficulties encountered by Popper and Lakatos 
and which is not subject to much of the criticism Feyerabend has levelled 
at those authors. The account presupposes that the society in which the 
change of theory or research programme takes place supports the science 
in question, so that a large number of scientists with the appropriate 
skills and aptitudes can be assumed to be available to work on the rival 
theories or programmes. If this assumption is satisfied, then, given two 
rival programmes, if one possesses objective opportunities for development 
to a high degree and the other does not, then the former will progress at a 
greater rate by virtue of that fact. This will be the case even if the majority 
of scientists were to decide to work on the programme with the smaller 
degree of fertility. In such an event the luckier or more perceptive minority 
would soon meet with success while the majority, those representing the 
consensus view, would try in vain to take advantage of non-existent 
opportunities. François Jacob ([1976], p. 11) captures the spirit of my
position when he writes, 


[I]n this endless discussion between what is and what might be, in the quest 
for a chink revealing another possibility, the margin of freedom of the individual 
investigator is sometimes very narrow. The importance of the individual 
decreases as the number of practitioners increases: if an observation is not made 
here today, it will most frequently be made somewhere else tomorrow. 


My position can be made clearer with an example I have had occasion 
to use before.2 It is an extension of an example used by Popper ([1972],
pp. 116-17) to illustrate the objective character of problem situations. 


1 This is, of course, a major assumption that will need to be defended in any particular 
case. I would not want to discount the kind of situation where political and social 
pressures lead to the suppression of a potentially powerful programme and the fostering 
of a weak one. The assumption on which my argument depends would presumably be 
roughly valid for nineteenth-century physics. 

2 Chalmers [1979].

We compare a garden in which there are a large number of nesting boxes
with a second, otherwise similar garden, in which there are no nesting 
boxes. Given that the environment of each garden is suitably populated by 
birds then it is highly likely that after some months or years many more 
birds will have nested in the garden furnished with nesting boxes than in 
the other. This eventuality is adequately explained in terms of the objective
opportunities for nesting offered by one garden as opposed to the other. 
The important point about this example for my purpose is that there will 
be no need to refer to the decisions of birds and the rationality or otherwise 
of those decisions in the explanation. 

The account of theory change that I have offered needs to be qualified 
in an important way. The fact that a programme presents opportunities 
for development is no guarantee that those opportunities will lead to 
actual success when taken up. Whether or not a research programme 
supersedes a rival will depend, not solely on its degree of fertility, but also 
on its success in practice. The position I have outlined needs to be 
augmented by a theory of novel predictions. E. Zahar [1973] and A. 
Musgrave [1974b] have made interesting attempts to improve on Lakatos’s
treatment of this essential component of an objectivist account of theory 
change. 

The case-studies that Lakatos and his followers have carried out as an 
illustration and test of the methodology of scientific research programmes1
can be readily interpreted in a way that is consistent with my modification 
of the methodology. Theory change can be understood as coming about by 
virtue of the fact that an established theory was challenged and ousted by
a rival that offered more objective opportunities for development, some 
at least of which bore fruit. This contrasts with attempts to explain theory 
change by reference to the rationality or otherwise of the decisions and 
choices of individual scientists. Once the Lakatosian case studies are 
rewritten so as to conform to my objectivist version then much of the 
criticism levelled at them by Feyerabend and others can be neutralised. 
A. Musgrave ([1976], pp. 286-8) comes very close to the view I am 
advocating when he puts himself in the shoes of a late eighteenth century 
chemist comparing Lavoisier’s oxygen programme with the phlogiston 
programme. He lists six kinds of opportunities for research provided by 
the oxygen programme but is ‘at a loss’ to find similar opportunities in the 
phlogiston programme. On my account this is very nearly sufficient to 
explain the triumph of one programme over the other. Musgrave, by 
contrast, finds it necessary to use his description of the objective situation 
to illustrate what advice the eighteenth century chemist might give to a 

1 For instance, those in Howson (ed.) [1976].

young man entering the field. On my objectivist account, advice and
decisions will have an important bearing on what I have referred to as the 
problem of choice but do not constitute the reference point for a solution 
to the problem of theory change. 

I have attempted to open the way for an account of theory change that 
removes conventional decisions and choices from a position of epistemo- 
logical primacy. In doing so I do not wish to suggest that the individual 
subject can be removed entirely from the activity of science. Science would 
not progress or even exist in the absence of decisions and actions of 
individual scientists. Further, it may take a particularly ingenious or 
‘creative’ individual to discern an opportunity that everyone else has 
missed. However, the significance and consequences of those decisions 
and actions must be evaluated with reference to the objective situation. 
It must also be admitted that individual scientists or groups of scientists 
will be able to affect and change the objective situation the more efficiently 
the better they understand it. 


University of Sydney


Self-knowledge, Uncertainty, and Choice

by FREDERIC SCHICK


This paper takes up some problems in the theory of choice, problems 
arising from questions of where and when we can choose at all, or see any 
point in choosing. Some of these problems have a long history, others 
have come up only lately. The older problems have outstayed their time. 
They need to be reminded of this and pressed again to retire. The others 
are elusive and can be troubling. I will try to come to grips with these. 

A meta-problem here is how to tie all this together. I shall draw on the 
fact that it has been done before. The British economist G. L. S. Shackle 
has linked up much the same issues. Shackle discusses three problems. 
He holds that some radical steps must be taken in order to get around 
them. In this last, I think he is wrong, but we needn’t stop for that now. 
We shall set out what follows under Shackle’s three problem-headings, 
and start each time with Shackle himself. 


1 Suppose that everything must happen exactly as it does. ‘. . . suppose
that history is a book already written, whose pages the hand of fate is 
merely turning, not composing’ (Shackle [1966], p. 72). What we shall be 
doing at every moment in our lives is fully noted in the book. In a world of 
the sort described, can we choose what we do? Shackle says that we can’t. 
In such a world, what people think their choices ‘...can be no more 
than the clicking of the machine as it works, and...is... something 
wholly different from that explosion of essential novelty which they seem 
to be the person whose tense thought and feeling give them birth’ ([1969], 
p. 3). In a fully pre-set world, choice is illusory.

Shackle takes the world of the book to be determined throughout. He 
sees determinism as ruling out freedom, and so—since he holds that a 
choice must be free—as also ruling out choices. The problem here is 
ancient. It goes back at least to Lucretius: either we reject determinism or 
we give up thinking we choose. Along with many other humanists, Shackle 
rejects determinism. 

We are asked to decide between determinism and choice. I will be 
very brief here, for I follow Hume in thinking we can have them both. 
Hume went at it in two ways. First, he set out determinism as compatible 
with freedom. Granted that all events are determined, for all events are
caused, those that involve us as agents included. But where we are agents, 
we do what we do because we want to do it. This is what makes us agents 
there. It also makes what we do there free. My taking a shower, and then 
making some coffee and going for a walk are all caused, and so they are 
determined, but they are caused by my wanting to do them, and so they 
are also free. Determinism allows for freedom. And so it allows for choices.
It may be said that how we will choose is also already in the book. On 
the determinist doctrine, a choice must always have been what it was. 
It is never open to us not to want what we wind up doing. How then is 
the outcome up to us? Hume’s second (and basic) line challenges the idea 
of unavoidability. All events are caused, but this means only that each is
preceded by another on an invariant pattern. Causality is a matter of 
regularity, not of any sort of compulsion. It has to do with sequences only,
not with pushes and shoves (in Hume’s terms: with conjunction, not with
connection). On a Humean view, necessity is no part of the picture. We 
might always have chosen otherwise. Alternatives were always open. 
There are of course the familiar issues. What is a causal regularity? 
What distinguishes a causal sequence from a mere coincidence? Or go 
back to the preceding paragraph: what sets off my doing something because 
I want to do it from my doing it while (or right after) wanting to? These 
are dreary questions, but better, I think, to slog along here than to have 
to settle for either determinism or choice. Humean determinists are not 
in the clear, but at least they needn’t believe that choices are all illusory. 
None of this discredits the possibility of Shackle’s great book of history. 
Indeed, the whole question of determinism can be skirted. Even if 
determinism were false, the book of history might still exist—the future 
need not be determined in order to be foreseen. Here we approach a 
second problem, the problem of the foreknowledge of our future. There 
are in fact two separate issues under this new heading. One, again, will 
not hold us long. The other is harder to get at, and also more rewarding. 
The first of these issues has an ancient lineage. Some of the early 
Christian thinkers agonised long and hard over it. For them, the problem 
arose from their assumption of the omniscience of God. If I am going to 
do x, then God, since He knows everything, knows that I will do it. This 
means that it is true beforehand that I will do x, for no one (not even God) 
can know what is not true. Since it is true beforehand, it will be true 
at the point of action. It will be true necessarily there, for it can’t possibly 
be true at one time and not at another. So I will have to do x—will 
necessarily have to. This works out without a word about how I am going 
to choose. But then, what choice will I have? 
Or look directly at choices. God knows beforehand whether or not I
will do x. Being omniscient, He also knows whether I will choose to do it. 
If He knows that I will choose this, then I must of course choose it. If I 
must, it’s out of my hands. Again, what sort of a choice is that? ‘If God 
foresees all things and cannot in anything be mistaken, then that which
His Providence foresees must result; wherefore if It knows beforehand 
not only men’s deeds but even their designs and wishes, there can be no 
freedom of judgment’ (Boethius, Cons. Phil., bk. 5). 

Unbelievers ought not to smirk. The same predicament can also be got 
to without bringing in God. Suppose that you knew yesterday that I 
would do x today. Again we can work our way to the conclusion that I 
have no choice about it. Indeed the problem is simpler still—no one at 
all need foreknow the fact. Aristotle saw this clearly (De Interp., ch. 9). 
His form of the argument is this: if I am to do x today, then it was always 
true in the past (even if no one knew it then) that I will now do x, and so 
I must now do it. My doing x is a fait accompli, and thus there is ‘. . . no 
need to deliberate or to take trouble.’ 

Aristotle avoids this conclusion by rejecting the first step toward it. 
He holds that a proposition about an event is never true before that event, 
nor is its negation false. This makes trouble for the theory of knowledge. 
A second, less costly, line is Ryle’s ([1956], ch. 2). Ryle notes that the 
argument involves a sleight-of-hand with the terms of necessity. Granted, 
if it was true yesterday that I would do x today, then it is necessarily true 
that I will do it. But parse this conditional properly, and there is no 
problem. The conditional is not: if it was true yesterday that I would do 
x today, then it-is-necessarily-true-that-I-will-do-it. Rather, it is: if it was 
true yesterday that I would do x today, then-it-is-necessarily-true-that 
I will do x today. Formally, what will be granted is not: if (yesterday) p, 
then necessarily-p; but only: if (yesterday) p, then-necessarily p. The 
necessity operator qualifies not the consequent but the connective. It serves 
only to indicate that the if-then structure is analytic. It has no force left 
over to establish fatalism. 
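
The two readings can be set side by side in modal notation. This is only a sketch: the operator Y ("it was true yesterday that") and the box for necessity are notation assumed here, not Ryle's own.

```latex
% Y p  : "it was true yesterday that p" (assumed notation)
% \Box : necessity (requires the amssymb package)
\begin{align*}
\text{granted:}     &\quad \Box\,(Y p \rightarrow p) \\
\text{not granted:} &\quad Y p \rightarrow \Box\, p
\end{align*}
```

Only the second form would deliver fatalism, and only the first is conceded.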

Let us move to the second, deeper issue of foreknowledge. This takes 
the foreknowing party to be the agent himself. The agent here is not omni- 
scient. He does know, however, how he will choose. This is the form of 
the foreknowledge problem considered by Shackle, though he brings out 
only a special case. Suppose that the agent 


... has in mind what he accepts as a complete list of all the relevantly distinct 
actions open to him, and that for each of these available actions he sees one 
and only one possible sequel differing from that of every other of the available 
actions....If we now further assume that these sequels can be completely 
ordered by him so that, when he compares any two of them, one is preferred
to the other, it is plain his available actions will be similarly ordered, and that 
the question ‘Which action shall I take?’ will call for nothing which can be
called judgment . . . I shall say that, in such a case, decision is empty ([1966],
pp. 73-4).

We might put it this way. If we foresee our choice-situation (our 
preferences, our options, their sequels and the rest), then we can work 
out how we will choose. Suppose we expect to act as we choose—call this 
the assumption of effectiveness. Then if we foresee our situation, we can 
see also how we will act. There is nothing left to consider, no occasion 
for judgment: perfect foresight empties choice. 

No doubt this too has a history. There must have been some medieval 
who asked whether the omniscience of God extended to His own future, 
and if so, how come His choices were not empty. But no record is left of 
this thinker. The recent discussions start from scratch—see Ginet [1962], 
Pears [1968], and Goldman [1970], chapter 6. Note that the Rylean 
response to the two-person problem (one person foreknowing the other’s 
choice) gets us nowhere in the one-person case. This issue can’t be reduced
to something independent of a foreknower. What here empties a person’s 
choice is not its being true beforehand that he will choose that way, but 
his own knowing that he will. If he knows how he will choose, he also 
knows how he will act. But then he faces no issue, for knowing that he 
will do x, he cannot consider whether to do it. And where he faces no issue, 
he has no choice to make. 

Shackle presents the problem in a very narrow form. He speaks only 
of someone’s foreknowledge emptying his choice where this person 
‘... supposes himself to know precisely, completely, and for certain
what consequences for himself would flow from any given one of the
available acts’ ([1969], p. 4). This agent faces no risk. Foreknowledge
empties choice in the wider world too. Richard Jeffrey [1977] brings out a 
more inclusive problem. 

Let oˣ be the agent’s option of doing x, and T be any logical truth. It is
a thesis of utility theory that the utility of a proposition w is the weighted
average of the utilities of w’s coming out true in this contingency or in that—given that
these are exclusive and exhaustive—the weights being the probabilities 
(conditional on w) of these contingencies. (A special case appears below 
as (19).) Jeffrey presents his problem as follows. 


Suppose you [the agent] think it within your power to make oˣ true if you wish,
that you prefer oˣ to T, and that you are convinced that oˣ is preferable to every
other one of your options. Then the probability of oˣ is 1, for you know you
will make oˣ true. But then, [setting v = oˣ and w = T in (19)] . . . your utility
for oˣ is equal to your utility for T, so that . . . you are indifferent between oˣ
and T after all, in contradiction to the assumption that you prefer oˣ to T
([1977], p. 136; this has been touched up slightly to square with our notation
here). 
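
The step Jeffrey compresses can be written out. The sketch below assumes the weighted-average principle in the form u(w) = p(v, w)u(v & w) + p(not-v, w)u(not-v & w), writes o^x for the option of doing x, and takes the conjunction of any proposition with the logical truth T to be that proposition itself.

```latex
\begin{align*}
u(T) &= p(o^x, T)\,u(o^x \,\&\, T) + p(\text{not-}o^x, T)\,u(\text{not-}o^x \,\&\, T) \\
     &= p(o^x)\,u(o^x) + \bigl(1 - p(o^x)\bigr)\,u(\text{not-}o^x) \\
     &= 1 \cdot u(o^x) + 0 \cdot u(\text{not-}o^x) \;=\; u(o^x).
\end{align*}
```

With the probability of the option at 1, the utility of the option and that of T coincide, which is the indifference Jeffrey reports.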


I shall lay out the larger problem in a way closer to Shackle’s thinking 
and hold off Jeffrey till later. Suppose that the agent knows what the 
sequels of his options would be in each of various contingencies. (Shackle’s 
special case is that in which the agent sees one contingency only.) Reflecting 
on the possibilities, the agent may realise that his doing x—again, this 
option is oˣ—offers him more than anything else. Aware of his general
rationality, he may infer that he will choose oˣ, and so, assuming effectiveness,
that oˣ will come out. Can this person choose here? He knows that
he will do x, so he has no issue of whether he will do it. What is there 
left for him to choose? 

I have been speaking of this as a problem of the foreknowledge of 
choices, but knowledge comes into it only indirectly. Basically, it is a 
problem of forebelief. Let me put it more fully. Suppose that the agent
thinks that his issue will be made up of options o₁, o₂, . . . oₘ, and that
s₁₁, s₁₂, . . . s₁ₙ; s₂₁, s₂₂, . . . s₂ₙ; . . . sₘ₁, sₘ₂, . . . sₘₙ would be the sequels
of these options in the contingencies he sees. (An option here is a live 
option: not just anything the agent might do, but one of a set of possible 
actions that, conjointly, raise an issue for him.) Let the agent also think 
that his over-all preference ranking will be R. If R meets certain con- 
ditions of specificity, it follows (I show how in the second part) that the 
agent’s options will have determinate expected utilities. Let the agent 
believe that he will choose rationally—that he will choose an option 
whose expected utility won’t be exceeded by that of any other—and let 
him believe that the choice he will make will be effective. Suppose that 
he will not change his mind on any of this before he chooses. It follows 
from what he believes that some set S of his options is the set from 
which he will choose one, and that this option will come out. To simplify 
a bit, suppose that S contains only the single option oˣ. The agent is now
committed to believing that he will do x. He will also be bound to believe 
this at the point of choice. If he does believe it then, he won’t have any 
issue of whether to do x or not, and so won’t have any choice to make. 
Foreseeing his choice-situation precludes his having a choice. 

We can generalise further. Shackle and Jeffrey both find a problem 
for rational-choice theory here. But the agent’s assumption that he will 
choose rationally can give way to others. Let the agent believe that he will 
pursue choice-policy C—whether this is maximisation, or something else, 
whatever it is, as long as it allows him to predict his choice from the rest 
of his information. (This last clause means that policy C might be that
of maximining or maximaxing or the like, but can’t be that of flipping a 
coin or of consulting some oracle.) Given any suitable C and all the rest 
of what he knows, the agent is logically bound to believe that he will do x 
(or it may now be y). Again, this settles whether he’ll do it, and so dissolves 
his issue. The problem remains: foresight undoes choosing. 
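
A minimal sketch may make the contrast between policies concrete. It assumes a hypothetical table of utilities for three options in three contingencies; the numbers and the names `SEQUEL_UTILITIES`, `maximin` and `maximax` are invented for illustration. A policy of the maximin or maximax kind is a function of the table alone, so an agent who knows the table can compute beforehand what the policy will pick. A coin flip offers no such function.

```python
# Hypothetical utilities of each option's sequel in each of three
# contingencies. The numbers are invented for illustration only.
SEQUEL_UTILITIES = {
    "x": [4, 4, 4],
    "y": [0, 9, 2],
    "z": [3, 5, 1],
}


def maximin(table):
    """Pick the option whose worst sequel is best."""
    return max(table, key=lambda option: min(table[option]))


def maximax(table):
    """Pick the option whose best sequel is best."""
    return max(table, key=lambda option: max(table[option]))


# Either policy lets the agent predict his own choice from the table.
predicted_by_maximin = maximin(SEQUEL_UTILITIES)
predicted_by_maximax = maximax(SEQUEL_UTILITIES)
```

On these figures the maximiner foresees that he will do x and the maximaxer that he will do y. In either case the prediction is available before the point of choice.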

Shackle is impressed by his version of this. He thinks it goes to the 
heart of our theory of how people choose, and that it shows we need a 
new theory. We must, for a start, deny the agent is certain regarding his 
choice-situation. We shall get to Shackle’s idea of uncertainty later. 

Speaking of his own version, Jeffrey finds no difficulty. He holds that 
‘... the alleged contradiction arises through conflation of two different 
preference rankings: Yours before deciding to make oˣ true, and yours
thereafter’ ([1977], p. 137). Your preferring oˣ to T is part of your
pre-choice ranking. Your being indifferent between them is part of your
ranking post-choice. Jeffrey rejects the possibility that you might know 
you will do x and so be indifferent before you choose. (Asked to comment 
on Shackle’s discussion, Jeffrey would say you don’t have a choice only 
where you already have made it.) 

We spoke of a person’s foreseeing his issue and his preference ranking. 
Jeffrey denies this ever happens. He holds that in thinking out what to 
do we set up the options from which we will choose and clarify our 
preferences. We are never done with this until the point of choice is 
reached. Or perhaps the other way: where we wind up deliberating, there 
we choose on the spot, even where we can’t yet act. So we are never in a 
position to predict how we will choose. As soon as we settle our minds, 
the choice is made and all is over. 

Granted we often don’t know till we get there where we will be. But
things are not always this much in the dark. Sometimes we know very 
well, but hope that a new option will still come up—that in fact we won’t 
face the issue we now anticipate facing—or even that we won’t have the 
preferences we now expect to have. So too for the factors that Jeffrey 
doesn’t mention. Sometimes the agent sees the possible sequels before he 
chooses. Sometimes he does not see them until the point of choice. Some- 
times he sees his choice-policy; sometimes indeed not. Sometimes he sees 
his choice will be effective, and again, sometimes he doesn’t. Where he 
does see all this beforehand, the problem is in force. 

Let me lay out the problem once more, and bring out its basic structure. 
Suppose that an agent A is what we shall call deductively thorough, that 
he believes all the deductive consequences of whatever he believes. Also 
that he is belief-retentive, that between now and the point of choice he will
not drop any belief that he holds. (For short, say simply that A is thorough
and retentive.) Suppose also that A believes that 


(1) A’s issue at the point of choice will be made up of options o₁, o₂,
. . . oₘ,
(2) The sequels (conditional sequels) of these options would be s₁₁, s₁₂,
. . . s₁ₙ; s₂₁, s₂₂, . . . s₂ₙ; . . . sₘ₁, sₘ₂, . . . sₘₙ,


(3) A’s preference ranking at the point of choice will be R, 
(4) A will pursue choice-policy C, and 
(5) A’s choice will be effective. 
We are assuming it follows from (1)-(4) that
(6) A will choose oˣ.
It follows from (5) and (6) that
(7) It is true that oˣ.


Note that (7) does not contradict (1)-(5). Notice in particular that it does 
not contradict (1): the fact that A will do x does not preclude his facing 
an issue of whether or not he will.

But let us recall A’s total situation. We are supposing that 


(8) A believes (1)-(5).

Since A is deductively thorough, it follows that 
(9) A believes A will choose oˣ,

and also that 


(10) A believes it is true that oˣ.
Since A is belief-retentive, it follows that
(11) A will at the point of choice believe it is true that oˣ.


Clearly, (8) alone is not troublesome either. But (1)-(5) plus (8) make for
trouble, for this (given A’s thoroughness and retentiveness) implies (11),
and (11) undercuts (1). Since A believes that he will do x, he cannot ask
himself whether he will. None of the possibilities mentioned in (1) remain 
(live) options for him. Where he sees what he will do, there is no issue left. 

Now (8) itself refers to (1)-(5), so there is something wrong with these
five propositions. They are not inconsistent. But if A believes them all, 
and is thorough and retentive, then at least one of them must be false. 
Hintikka [1962] has studied the logic of some closely related cases. In his 
terms, a set of propositions that could not all be true in a world in which A 
were thorough is indefensible for A. All sets of inconsistent propositions are 
indefensible for A, but the converse of this does not hold. The two 
propositions A believes v and A does not believe w are not inconsistent,
where v is not the same as w; but if v implies w, these two are conjointly 
indefensible for A. A set of propositions that could not all be true in a 
world in which A were thorough and believed these propositions is 
doxastically indefensible for A—for instance, the pair of propositions v and 
A does not believe w, where v and w are as above. Enlarging on this now, a 
set of propositions that couldn’t all be true in a world in which A believed 
these propositions and was thorough and also retentive through some period 
t will be said to be doxastically t-indefensible for A, or for short, d. 
t-indefensible for him. The set of propositions (1)-(5) is d. t-indefensible
for A, t here being the period from now until A chooses. There is no
world of the sort described in which they could all be true. So A can’t
properly believe them all. 

What does ‘properly’ mean here? Someone believes something properly 
where he can defend it without looking foolish. (This is very weak, but 
that is as it should be.) A person can’t properly believe something if he 
can fend off a challenge only by denying that he is thorough or that he 
believes what he is defending or that he will continue to believe it. 
Thoroughness and self-awareness are part of being rational—I argued 
this in Schick [1966]. Retentiveness through t is no part of rationality,
but where t starts now and is to be short (take this for granted above),
no one can disclaim it without sounding frivolous, or perhaps senile. 
A can’t properly believe (1)-(5) because he cannot face up to the challenge 
that these propositions backfire. He can only defend these propositions 
at the cost of discrediting himself. 

Again, what A believes is not inconsistent. But if we make some modest 
assumptions about what A is like, we can prove that what A believes is 
false. A can rebut only by contesting these assumptions, by insisting that 
he is not self-aware, or isn’t thorough, or not retentive. He can hold on 
to (1)-(5) only by arguing that he is foolish (better perhaps: shallow). A
sensible A would rather give up (1)-(5). 

A is no better off if he does not believe (1)-(4) but does believe the
derivative proposition (6) along with (5). For the set of propositions (5)
and (6) is also d. t-indefensible for A, t being as above. (I am here
supposing that a person can only choose what had been an option for him
before.) It follows that the unit set of 


(12) A will effectively choose oˣ


is d. t-indefensible for A. Again, this does not say that (12) is self- 
contradictory, or even that (12) is false. What it says is that (12) plus A’s 
being thorough and retentive and believing (12) is self-contradictory: that 
if A is thorough and retentive and believes (12), then (12) is false. It says
that A cannot properly believe (12). We might put this another way. Since 
knowledge is justified or proper belief, A can’t know (12). A person can’t 
ever know how he will (effectively) choose. (Thus the problem of forebelief 
comes to be one of foreknowledge after all.) 

Jeffrey's form of the problem is also covered. A here believes that 


(13) A will prefer oˣ to T,

(14) A will prefer oˣ to all his other options,

(15) A will choose that option which he prefers to all the others, and 
(16) A’s choice will be effective. 


These propositions are not contradictory. The problem is rather that our 
norms of self-awareness and the rest keep A from being able to defend 
them. Given the usual utility theory, the set of (13)-(16) is d. t-indefensible 
for A. Jeffrey’s own intuitions come close: ‘In the light of your awareness 
of your options, your awareness of your preference for . . . oˣ over T reduces
that preference to indifference’ ([1977], p. 136; emphasis added). 

This has to do with self-knowledge only. The two-person case goes 
smoothly. (12) is not d. t-indefensible for B, but only for A. B can properly 
believe (12). He can know how A will choose. He can’t, however, convey 
what he knows about A to A himself. This last reverses a familiar
philosophical thesis on privacy. The usual thesis is that there is much about
us that we can know but no one else can, unless we choose to tell them. 
The point here is that others can know things about us that we ourselves 
cannot know, not even if these others tell us. Privileged access turns out 
to go in both directions. 

Time now to sum it up. We have seen that logic alone rules out our 
knowing the whole truth about ourselves. Where we will (effectively) 
choose during t, self-omniscience is out during t. This is the basic con- 
clusion above. How disturbing is it? Not very, I think. No one has any 
commitment to full self-knowledge. We want to know ourselves better. 
Some of us only want (for protection) to know ourselves better than others 
do. This is not ruled out. All that must go is expendable. 

Our conclusion is moreover not the first of its kind. We have been 
speaking of the foreknowledge of choices. Popper [1950] concluded the 
same about the foreknowledge of discoveries, or better: of realisations. 
A must be uneasy with the proposition that he will realise that p. In our 
terms, the unit set of this proposition is d. t-indefensible for him, t here
being the period from now until his realisation of p. The proposition 
implies that p, for a person can’t realise something not true. So if A 
believes this proposition, and is deductively thorough, he believes also 
that p. If he is retentive through t, he will still believe p when he realises it.
But this contradicts itself, for no one can realise (or come to believe) what 
he already believes. It follows that A can’t properly believe that he will 
realise p. He can’t know he will realise this. (Popper’s own argument is 
rather different. Also, he holds it disproves determinism, but this now 
not in Humean terms but in those of predictability.) 

Back to our result about choices. Let me stress the weakness of this. It 
does not say that we can’t know anything beforehand about how we will
choose. A can certainly know he will choose as B would have chosen, or 
as C wants him to, or in a way that he will regret. He can know this if 
these aren’t themselves his options (if he is not deliberating whether to 
choose as B would, or as C would, or as D would) and if he can’t infer 
from this knowledge which of his options he will choose. What he cannot 
know is that he will choose oˣ. He might properly predict his choices
under many descriptions. All that is ruled out is his foreknowing his
choices in the terms of his options. 

Also, the above has little to say on the foreknowledge of actions. Some 
authors (for instance, Hampshire [1959], chs. 2 and 3) argue that we can’t 
know beforehand what we will do, unless we already intend to do it and 
know we have this intention. Our result here does not support this, at 
least not in general. To the extent that what we will do depends on how we 
will choose, the limits on the foreknowledge of our choices restrict the 
foreknowledge of our actions. But we can act without having chosen. 
(I have just scratched my nose. I faced no issue and made no choice: it 
itched, so I scratched.) The fact that actions involve intentions (I wanted 
to scratch my nose) does not go against this. An action can be intentional 
and yet not derive from a choice. So the problem of foreknowing our 
choices does not (in each case) carry over to our actions. 

Still, we can underplay it too far. Granted there is no paradox here. A 
may nonetheless be troubled, at least where a d. t-indefensible set is not 
single-membered. He cannot properly believe all the propositions (1)-(5), 
nor all of (13)-(16) nor both (5) and (6). Yet each of these propositions 
singly may be credible enough for him. Which of them should he reject? 
There is no general answer to this. A will have to find his way out of each 
quandary as it comes. 


2 Shackle introduces his third problem as follows. Suppose the agent 


... can discern no pattern of association between act and sequel, so that it 
appears to him that any sequel . . . can follow any act. Then he will see no purpose 
to be served by choosing amongst acts. In such a case I shall say that decision is 
powerless ([1966], p. 74). 
What there is problematical here is harder to see than with the rest.
The problem of illusion seemed to be that we have no choices whatever.
The problem of empty choices is that we can’t have full self-knowledge. 
We want to think we have choices, and it may unsettle our pride to think 
that full self-knowledge can’t be had. But nothing at all is denied us here. 

A familiar analysis moreover suggests itself in this context. If every 
option has the same set of possible sequels, each sequel still need not 
follow every option with the same probability. Suppose that the probabili- 
ties differ. If the agent is not indifferent as to which of the sequels comes 
out, the expected utilities of his options may differ, and choice might 
make sense for him after all. 
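
A small numerical sketch shows how this familiar analysis runs. The two options, three sequels, utilities and probabilities below are all invented for illustration. Both options share the same set of possible sequels, yet their expected utilities differ because the probabilities differ.

```python
# Invented utilities for three sequels shared by both options.
sequel_utility = {"s1": 10.0, "s2": 4.0, "s3": 0.0}

# Invented probabilities of each sequel conditional on each option.
# Every option can lead to every sequel, but not equally readily.
sequel_probability = {
    "o1": {"s1": 0.6, "s2": 0.3, "s3": 0.1},
    "o2": {"s1": 0.1, "s2": 0.3, "s3": 0.6},
}


def expected_utility(option):
    """Probability-weighted average of the sequel utilities for one option."""
    probabilities = sequel_probability[option]
    return sum(probabilities[s] * sequel_utility[s] for s in probabilities)


expected = {option: expected_utility(option) for option in sequel_probability}
best_option = max(expected, key=expected.get)
```

On these figures the first option has the higher expected utility, so choosing between the two is not pointless even though any sequel can follow either act.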

Shackle rejects this reasoning. In the case that he describes, ‘. . . foresight 
would ... be at the opposite extreme from perfect, . . . uncertainty would 
be unbounded’ ([1965], pp. 4-5). He sees all cases involving uncertainty 
in a way that rules out probabilities, and so also rules out expected 
utilities. This of course does leave him hanging. But how does he come to 
it? 

The core of Shackle’s position here is his concept of sequels. On his 
view, these are not the actual causal consequences of the options the 
agent has, nor their conditional consequences in this contingency or that. 
Rather, they are what the agent thinks would causally follow his options. 
The agent may suppose that there are n contingencies, so that o₁ would
be followed either by s₁₁ or by s₁₂ or by . . . s₁ₙ, that o₂ would be followed
by s₂₁ or by s₂₂ or by . . . s₂ₙ, etc. He may also suppose the number of
contingencies to be infinite. Shackle assumes that an agent facing an issue
sees things in neither of these ways—he calls this the assumption of 
bounded uncertainty. He holds that the agent thinks the contingencies are 
finite, but that he cannot assign any specific number to them. 

This new assumption takes a stand on how a person sees things. How 
can we generalise a priori about what people think? In general, of course, 
we can’t, but Shackle insists we must make an exception for what this 
assumption says. He points out that contingencies are not actual states or 
conditions of some sort. The physical eye can’t see them. Contingencies 
are possibilities, and can only be imagined. This now connects with 
Shackle’s own version of indeterminism: Shackle holds that a person’s 
imaginings are outside the scope of causality. The agent is creative here. 
Suppose he is also self-aware, that he knows his thinking is free. However 
many contingencies he notes, he knows he could have brought out more. 
He knows that the number of possibilities—the number he might have 
noted—is finite but can’t be fixed. The principle of bounded uncertainty 
says just this. 
I have already had my say on determinism and freedom. We needn’t
reject determinism to grant that someone could have thought differently. 
But this is a side-issue here. The question is: suppose we accept Shackle’s 
assumption—what difference does it make? Shackle argues that it rules 
out the probabilistic measurement of the contingencies. He notes two
kinds of probability. Frequency probabilities are out of place, for all 
contingencies are unique, and we can only speak of the frequency of 
repeatable types of conditions. Logical probabilities are out too. These 
are ratios of possibilities, and call for the possibilities to be finite and 
all of them equally possible, or at least comparable as to their relative 
possibility. The assumption of bounded uncertainty now keeps us from 
knowing that this holds. (Strictly: only the part on uncertainty comes in 
here.) Where the agent knows that some (or all) of the contingencies he 
sees are unions of narrower possibilities, but can’t say (since he can’t 
number them) how many items are included in each union, he can’t know 
that the contingencies he sees are all equally possible. Nor can he know 
that this one is twice or three times as possible as that. So he has no proper
basis for any distribution of probabilities. Shackle concludes that probabili- 
ties are inapplicable altogether. This means of course that expected-utility 
analyses can’t get started. 

Shackle goes on to work out a logic based on a non-probabilistic measure 
of what he calls potential surprise. I will not follow him further. (Those who 
want to continue should see Levi [1966] and [1972] on surprise and also 
Watkins [1955] for the full theory.) The argument rejecting probabilities 
seems to me to be thin. Shackle ignores subjective probabilities (the 
Ramsey-de Finetti-Savage sort). I see no reason for doubting that a
subjectivist analysis applies. If it does apply, expected utilities are back 
in the picture. 

Still, two questions have been raised. Clearly we often choose in 
situations in which we have no determinate probabilities. The choice 
problems here are usually labelled problems under uncertainty. Where and 
how do these problems arise? And what sort of logic of choice is appropri- 
ate? I shall set out answers to these two questions below. The answer to 
the first appears in an extension of Ramsey’s [1931] analysis, or rather in 
an extension of a revised form of it. The second answer derives from an 
analysis of Levi's [1974]. 

I start out with the concept of preference. Two initial definitions here. 
If a person neither prefers v to w nor prefers w to v, he is indifferent as to 
v and w. If he is indifferent as to v and w and moreover prefers v to every 
proposition to which he prefers w and also prefers to v every proposition 
that he prefers to w, and if the converses also hold, then he equivalues
v and w. Equivaluation implies indifference, but not the other way. 
The next part is more ambitious. I want to show that the concept of
utilities can be spelled out in preference-terms, and that of probabilities 
in terms of utilities. Suppose that a person prefers some proposition to 
every other proposition, or that there is some set of propositions to none 
of which he prefers any other, none of them also being indifferent to any 
other to which some proposition is preferred. Call this topmost proposition, 
or any proposition in the topmost set, α. Suppose also that there is some
proposition to which this person prefers all others, or some set of
propositions none of which he prefers to any other, none of them also
being indifferent to any other that is preferred to some proposition. Call
this bottommost proposition, or any proposition in the bottommost set, ω.
To rule out all these propositions being indifferent to each other, assume
that α is preferred to ω. Suppose finally the existence of some proposition
N such that the agent equivalues the propositions v if N, w if not and
w if N, v if not, whatever v and w, and ranks these two equivalued
propositions between v and w. (To pin this down, take N to be the toss will be heads
or the card drawn will be red.) 

One last preliminary. We shall speak of the α-to-ω spectrum. This is a
set of propositions generated from α and ω in a certain way: α and ω are
in the spectrum and also α if N, ω if not. Call this third proposition μ.
The spectrum then takes in α if N, μ if not and μ if N, ω if not. Call these
propositions γ and σ. The spectrum has α if N, γ if not and γ if N, μ if not
and μ if N, σ if not and σ if N, ω if not. On the same filling-in principle,
we add eight further propositions, etc.

We now assign utilities recursively. First, we assign utilities to α and to ω:
any numbers will do, provided only that the number assigned to α is the
higher. We go on to say that a proposition in the spectrum has a utility
that is half the sum of the utilities of the two propositions generating it.
The utility of α if N, ω if not is thus midway between the utilities of
α and ω. This mid-ranked proposition is μ. The utility of α if N, μ if not
(this proposition is η) is midway between the utilities of α and μ. The
utility of η if N, μ if not is midway between the utilities of η and μ, etc.
This fixes utilities for all propositions in the spectrum. If for some
proposition v that is not in the spectrum there is some proposition in the
spectrum that the agent equivalues to v, the utility of this equivalued
proposition is the utility of v. Or rather, it is the utility of v given our
choice of the origin and the unit of the utility scale, this being what our
initial valuations of α and ω provided.
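
The recursive assignment can be traced numerically. The sketch below is illustrative only: the function name `spectrum_utilities`, the `depth` parameter and the endpoint values 1 and 0 are assumptions, not part of the text. Each round of filling in places, between every pair of neighbouring propositions, a new one whose utility is half the sum of theirs.

```python
from fractions import Fraction


def spectrum_utilities(u_top, u_bottom, depth):
    """Utilities of the spectrum after `depth` rounds of filling in, highest first.

    u_top and u_bottom are the arbitrary utilities given to the topmost and
    bottommost propositions (hypothetical inputs; the text only requires
    u_top > u_bottom).
    """
    levels = [Fraction(u_top), Fraction(u_bottom)]
    for _ in range(depth):
        filled = []
        for hi, lo in zip(levels, levels[1:]):
            filled.append(hi)
            # Utility of "hi-proposition if N, lo-proposition if not":
            # half the sum of the utilities of the two generating propositions.
            filled.append((hi + lo) / 2)
        filled.append(levels[-1])
        levels = filled
    return levels


if __name__ == "__main__":
    for round_number in range(4):
        print(round_number, [str(u) for u in spectrum_utilities(1, 0, round_number)])
```

With endpoints 1 and 0, two rounds give five utilities spaced a quarter apart, one for each of the first five propositions of the spectrum; a different choice of endpoints only changes the origin and unit of the scale.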

Probabilities follow directly. Read p(v, w) as the probability the agent
would assign to v if he believed w, or the probability of v conditional upon
w (for brevity’s sake, the w-probability of v). Read u(w) as the utility
the agent assigns to w. Conditional probabilities can be defined as
follows:

\[ p(v, w) = \frac{u(w) - u(\text{not-}v \,\&\, w)}{u(v \,\&\, w) - u(\text{not-}v \,\&\, w)} \tag{17} \]

given only that u(v & w) ≠ u(not-v & w).

Consider the special case of p(v, w) in which w is some logical truth
(say, s-or-not-s). This gives us the agent’s probabilities unqualified. Read
p(v) as the plain probability the agent assigns (unconditionally) to v. The
suggestion is that p(v) is short for p(v, T) where T is the truth involved.
Putting T for w in (17) and simplifying, we get

\[ p(v) = \frac{u(T) - u(\text{not-}v)}{u(v) - u(\text{not-}v)} \tag{18} \]

All logical truths are the same (vacuous, null) proposition. So the utility 
of all such truths is the same, and it makes no difference what we take as T. 
To get some perspective here, consider the familiar principle: 


\[ u(w) = p(v, w)\,u(v \,\&\, w) + p(\text{not-}v, w)\,u(\text{not-}v \,\&\, w) \tag{19} \]


This says that the utility of w is the weighted average of the utilities of 
its coming out true along with v and along with not-v—the weights being 
the w-probabilities of v and not-v. It says that the utility of a proposition 
is a certain conditional-probability-weighted average utility. If we let 
p(not-v, w) = 1 − p(v, w), (19) follows from (17), and also conversely where
u(v & w) ≠ u(not-v & w). The probabilities of anyone of whom (19) is true
are related to his utilities as in (17).
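
The relation between (17) and (19) can be checked with a small numerical sketch. The function name `cond_prob` and the utilities 7, 10 and 2 are invented for illustration; only the formula itself comes from the text.

```python
from fractions import Fraction


def cond_prob(u_w, u_v_and_w, u_notv_and_w):
    """Definition (17): the w-probability of v, computed from three utilities."""
    if u_v_and_w == u_notv_and_w:
        raise ValueError("(17) is undefined when u(v & w) = u(not-v & w)")
    return Fraction(u_w - u_notv_and_w) / Fraction(u_v_and_w - u_notv_and_w)


# Hypothetical utilities, chosen only so that u(w) lies between the other two.
u_w, u_vw, u_nvw = Fraction(7), Fraction(10), Fraction(2)
p = cond_prob(u_w, u_vw, u_nvw)

# (19): u(w) is the weighted average of u(v & w) and u(not-v & w),
# with weights p(v, w) and p(not-v, w) = 1 - p(v, w).
assert u_w == p * u_vw + (1 - p) * u_nvw

if __name__ == "__main__":
    print("p(v, w) =", p)
```

The guard on equal utilities mirrors the proviso attached to (17): where u(v & w) and u(not-v & w) coincide, the utility of w carries no information about the weight given to v.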

We see that where a preference structure establishes (origin-and-unit- 
relative) point-utilities for every proposition, it also establishes a (unique) 
conditional point-probability for every proposition relative to every 
(logically contingent) other. It also establishes a (unique) nonconditional 
point-probability for every proposition. That is, where it establishes 
determinate utilities all around, it also establishes determinate probabilities 
all around. On the Ramseyan analysis, every proposition always has a 
determinate utility, and so each also always has determinate probabilities. 

But establishing determinate utilities involves one special assumption: 
that for every proposition not in a given spectrum there is always some 
proposition in that spectrum equivalued to it. This assumption need not 
hold. Suppose that in fact it is false, that some proposition v that is not 
in the α-to-ω spectrum is not equivalued to any proposition in this
spectrum. Suppose also that there are two propositions φ and ψ in the
spectrum, φ preferred to ψ, such that the agent is indifferent both between
v and φ and between v and ψ. Also that he prefers v to all those and only
those propositions to which he prefers ψ and that he prefers to v all those
and only those propositions that he prefers to φ. Here we can’t speak of the
point-utility of v, but can speak of its utility range: we can say that the
utility of v ranges between the utilities assigned to φ and ψ. That is, the
utility of v is here indeterminate within these limits.

Likewise for indeterminate probabilities. Where a person’s utilities are 
not all fully determinate, neither are his probabilities. We can say in 
general that a proposition has any probability established for it via our 
definitions by some selection of point-utilities from within the utility 
ranges of certain other propositions. That is, where either u(w) or u(v & w)
or u(not-v & w) is indeterminate, (17) does not establish a determinate
p(v, w), and so also for (18) and p(v). But any selection of utilities from
within the utility ranges gives us a corresponding probability. So there 
are now probability ranges: there are upper and lower bounds to the 
probabilities of v. The probabilities are indeterminate within these limits. 
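
How a probability range arises from utility ranges can be sketched by brute force. Everything below is an assumption made for illustration: the names `grid` and `prob_range`, the coarse grid of candidate point-utilities, the decision to discard selections whose (17)-value falls outside [0, 1], and the sample ranges. It is not offered as the closed-form generalisation of (17).

```python
from fractions import Fraction
from itertools import product


def grid(lo, hi, steps=4):
    """Evenly spaced candidate point-utilities within the range [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    return [lo + (hi - lo) * k / steps for k in range(steps + 1)]


def prob_range(range_w, range_vw, range_nvw, steps=4):
    """Lower and upper bounds of p(v, w) over selections of point-utilities.

    Each argument is a (low, high) pair giving a utility range. Selections
    yielding a value outside [0, 1] are treated as not jointly possible
    (an assumption of this sketch) and skipped.
    """
    values = []
    for u_w, u_vw, u_nvw in product(grid(*range_w, steps),
                                    grid(*range_vw, steps),
                                    grid(*range_nvw, steps)):
        if u_vw == u_nvw:
            continue  # (17) is undefined for this selection
        p = (u_w - u_nvw) / (u_vw - u_nvw)
        if 0 <= p <= 1:
            values.append(p)
    if not values:
        raise ValueError("no admissible selection of point-utilities")
    return min(values), max(values)


if __name__ == "__main__":
    # Hypothetical case: only u(w) is indeterminate, ranging from 6 to 7.
    print(prob_range((6, 7), (10, 10), (2, 2)))
```

When every range is a single point the two bounds coincide, which is the determinate case already covered by (17).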

This calls for generalisations of (17) and (18). Ranges can be set out as 
pairs of numbers, or two-item vectors giving the upper and lower bounds 
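of a range. Let $\bar{u}(w)$ and $\underline{u}(w)$ be the upper and lower bounds of the utility
range of w, and $\bar{p}(v, w)$ and $\underline{p}(v, w)$ the upper and lower bounds of the
w-probability range of v. Our generalisation of (17) is

\[ [\bar{p}(v, w),\ \underline{p}(v, w)] = \left[ \frac{\bar{u}(w) - \underline{u}(\text{not-}v \,\&\, w)}{\underline{u}(v \,\&\, w) - \bar{u}(\text{not-}v \,\&\, w)},\ \frac{\underline{u}(w) - \bar{u}(\text{not-}v \,\&\, w)}{\bar{u}(v \,\&\, w) - \underline{u}(\text{not-}v \,\&\, w)} \right] \tag{20} \]

given only that $\underline{u}(v \,\&\, w) \neq \bar{u}(\text{not-}v \,\&\, w)$ and $\bar{u}(v \,\&\, w) \neq \underline{u}(\text{not-}v \,\&\, w)$. In
line with the above, let $\bar{p}(v)$ be $\bar{p}(v, T)$ and $\underline{p}(v)$ be $\underline{p}(v, T)$. Our
generalisation of (18) is then

\[ [\bar{p}(v),\ \underline{p}(v)] = \left[ \frac{\bar{u}(T) - \underline{u}(\text{not-}v)}{\underline{u}(v) - \bar{u}(\text{not-}v)},\ \frac{\underline{u}(T) - \bar{u}(\text{not-}v)}{\bar{u}(v) - \underline{u}(\text{not-}v)} \right] \tag{21} \]

Where the upper and lower bounds of the utility ranges coincide, (20)
reduces to (17), and (21) to (18).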

Ramsey works out point-utilities and point-probabilities only. We go 
further and work out ranges. We do this by allowing for a proposition’s 
being indifferent to each of several non-indifferent others: v may be
indifferent both to β and to γ even where β is preferred to γ. This means
that we provide for a person’s indifferences not being transitive. Ramsey’s 
analysis assumes transitivity ([1931], p. 179). Our analysis does not. Of 
course, full transitivity is always possible. In that case, the ranges shrink 
into points. The situation that Ramsey considers is a special case of ours. 

We now have an answer to our first question. Probabilistic uncertainty 
arises where there is utility-uncertainty, and this holds where the agent’s 
indifferences are not always transitive. We might put it another way. Note
that equivaluations must be transitive. (This by our definition of
equivaluation.) If a person’s indifferences are transitive, all his indifferences
are equivaluations. Let us refer to any indifference that is not an equi- 
valuation as a mere indifference. Where a person is merely indifferent in 
whatever regard, his position, as a whole, might be described as vague. 
So it is vagueness that makes for uncertainty. 

What kind of a logic of choice applies in contexts of uncertainty? Levi's 
discussion shows how expected utilities still can be used. I define the 
expected utility of an option o as follows: 


\[ eu(o) = p(c_1, o)\,u(s_1) + p(c_2, o)\,u(s_2) + \dots + p(c_n, o)\,u(s_n) \tag{22} \]


where the c’s are the contingencies the agent sees and the s’s are the sequels 
of o in these contingencies. Notice that (22) is not a generalisation of (19). 
The sequel $s_1$ is the upshot of the agent’s taking option o in contingency $c_1$,
all that he thinks will causally follow his taking o where $c_1$ holds. He need
not equivalue $s_1$ and $c_1$-and-o. So $u(s_1)$ need not be the same as $u(c_1 \,\&\, o)$,
etc. (19) is a corollary of our definition (17). (22) defines a new concept.

We have seen that where a person does not assign a proposition some 
point-probability, he does assign it some probability range. It follows that 
though his position can’t always be laid out in terms of some single 
probability function—some single comprehensive point-probability assign- 
ment—it can always be represented by some set P of such functions. A 
person’s P contains all the probability functions compatible with his 
probability ranges, that is, all functions determinable by some conjointly 
possible constrictions of all these ranges to points. Suppose that a person’s 
position is maximally specific, that he assigns some point-probability to 
each proposition conditionally on every other. His P here is single- 
membered. Suppose that his position is maximally vague, that he assigns 
to each proposition only the full probability range from zero to one. Here 
his P is the set of all possible probability functions. Most actual positions 
fall somewhere between these extremes. 

Let the agent’s utilities for the sequels of each of his options in each 
of the contingencies he notes be determinate. We can say that a rational 
person would choose some option that has the greatest expected utility 
relative to one of the probability functions (any one) in his P. Where there 
is no uncertainty, all the functions in P assign the same probabilities to 
the contingencies noted, and the general policy reduces to the rule for 
choice under risk. Notice that, in other cases, it may provide for keeping 
an issue open to some extent. In some cases, it may even give us carte 
blanche. Each option may come out best relative to some function in P, so 
any choice might do. 
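
The policy for determinate sequel utilities can be sketched as follows. The names `expected_utility` and `admissible_options`, the three options and the two probability functions are all hypothetical, and for brevity the sketch assumes that the probabilities of the contingencies do not vary with the option taken, which the text does not require.

```python
from fractions import Fraction


def expected_utility(prob, utils):
    """Expected utility of one option: sum of p(c_i) * u(s_i) over contingencies."""
    return sum(p * u for p, u in zip(prob, utils))


def admissible_options(options, P):
    """Options with the greatest expected utility relative to some function in P.

    options: dict mapping an option name to its sequel utilities, one per contingency.
    P: list of probability functions, each a list of contingency probabilities.
    """
    passed = set()
    for prob in P:
        eus = {name: expected_utility(prob, utils) for name, utils in options.items()}
        best = max(eus.values())
        passed |= {name for name, eu in eus.items() if eu == best}
    return sorted(passed)


# Hypothetical example: two contingencies, three options.
OPTIONS = {"a": [10, 0], "b": [0, 10], "c": [4, 4]}
VAGUE_P = [[Fraction(3, 4), Fraction(1, 4)], [Fraction(1, 4), Fraction(3, 4)]]

if __name__ == "__main__":
    print(admissible_options(OPTIONS, VAGUE_P))
```

In this invented case options a and b each come out best under one of the two functions, so both pass, while c is best under neither. With a single-membered P the procedure is just the ordinary rule for choice under risk.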

Now to generalise this: the agent’s utilities may be of any degree of
specificity. We proceed as we did with probabilities. A person’s utility 
position can always be represented in terms of some (origin-and-unit- 
relative) set U of utility functions. His set U contains all the utility 
functions compatible with his utility ranges, all the functions determinable 
by some constriction of all these ranges to points. The special case of full
determinacy is trivial: this is the case in which the utilities of all
propositions are determinate, not just (as above) those of certain sequels.
Here all the ranges are points to begin with, and U contains one function 
only. (In this case, P too is single-membered.) In more typical cases, U 
(and so also P) is a manifold set of functions. A rational person keeps to 
those options that have the greatest expected utility relative to some pair 
of functions, one of them in his U, the other the matched function in his P. 

We can put it another way. Where the functions in the agent’s U don’t 
all assign the same utilities to the possible sequels of his options, these 
options don’t have determinate expected utilities for him. But each matched 
pair of functions in his U and P establishes an expected utility for each 
option, the set of all the expected utilities provided for an option delimiting 
its expected-utility range. Label the expected utilities established for a set 
of options by the same matched functions in U and P corresponding expected 
utilities. Our choice policy says that a rational person will choose an 
option that has an establishable expected utility—an expected utility in 
the range fixed for it—at least as great as the corresponding expected 
utility of any of the others. He will choose an option whose expected-utility 
range is not wholly below that of any alternative option. 

This is my answer to the question of the sort of logic of choice that 
applies. (Again, the answer is basically Levi’s.) There is one further matter. 
Where there is uncertainty, several of the expected-utility ranges may 
overlap, and so it may be that several options would pass. This recalls 
Shackle’s idea of powerless choices. Indeed it goes beyond Shackle’s case. 
It brings out that people face powerlessness (better: weakness or
inconclusiveness) in situations far less extreme than Shackle’s.

The usual response to this is to adopt supplementary logics. The 
familiar principles meant for the purpose have been devised for cases in 
which the utilities of all the sequels are determinate. These principles 
could now be expanded. The agent might be assumed to focus on only the 
upper and lower limits of the sequels’ utility ranges. Perhaps he always 
chooses one of the still-live options all of whose sequels have minimum 
utilities higher (or no lower) than that of any sequel of any other residual 
option. This new secondary policy would often close the issue. The trouble
is that neither it nor its variants has any claim to being thought rational.
(Samuelson has written somewhere that minimaxing does not make for
rationality, but for paranoia.) 

What other line is possible? As I see it, the expected utility range- 
overlaps do bring out a weakness, but not a weakness in ourselves. The 
finger points to rationality: there is only so much that rationality can do. 
There are limits to how far it can take us. Where we are not content to 
consider every residual option acceptable, we must look to principles of 
some extra-rational sort. But this dark suggestion won’t be pressed here. 


Rutgers University


Discussions


MILLER’S SO-CALLED PARADOX OF INFORMATION


1 The Paradox
2 Fallacies involving Substitution for Bound Variables
3 Paradox Lost
4 A Consistency Argument for the Straight Rule
5 Other ‘Paradoxes’ in Probability Theory
6 Functions and Numbers


1 THE PARADOX


It is a familiar feature of the probability distributions in science that they 
depend on, and are often continuous functions of, a set of parameters $\theta_1, \dots, \theta_k$
which have frequently some simple physical significance, characteristic of the
conditions producing the distribution. The probability, for example, of m
emissions from a decaying atom in time t is for fixed m and t a function of one
parameter λ called the rate parameter which gives the expected number of
emissions per unit time for that atom; the probability distribution is that
characteristic of a Poisson process, defined by

\[ P(X_t = m) = e^{-\lambda t} (\lambda t)^m / m!. \]

In general, the situation is summarised by the formula

\[ P(A) = f(\theta_1, \dots, \theta_k). \]

What is commonly done by Bayesians and others is partially to define a new 
probability measure, a logical, or in general a priori conditional measure P*, 
according to the principle 

\[ P^*(A / \theta = (\theta_1, \dots, \theta_k)) = P(A) = f(\theta_1, \dots, \theta_k) \tag{1} \]

where θ is a k-dimensional variate. The significance of this principle consists in
its permitting an evaluation of the posterior probability P*(θ ∈ dθ/E), where E
describes some statistical sample, according to Bayes’s formula P*(θ ∈ dθ/E) =
a P*(E/θ ∈ dθ) P*(θ ∈ dθ) where a is a known constant, and the prior distribution
P*(θ ∈ dθ) is assumed given.

A particularly interesting case of (1) arises where we consider a toss of a coin, 
whose possible outcomes T and H occur with (physical) probabilities 1 − p,
p respectively for some p ∈ [0, 1]. Suppose the coin is tossed n times, and consider
the probability of obtaining, say, an outcome $S_{r,n}$ of r heads and n − r tails
in some particular order. Then by (1)

\[ P^*(S_{r,n} / \mathbf{p} = p) = P(S_{r,n}) = p^r (1 - p)^{n-r}, \]

and in particular

\[ P^*(S_{1,1} / \mathbf{p} = p) = p. \]

1 See, for example, Jeffreys [1939], passim.

But $\mathbf{p} = P(S_{1,1})$ and so we have

\[ P^*(S_{1,1} / P(S_{1,1}) = p) = p \tag{2} \]

But S,,, could equally well have been any event A defined in the outcome space 
of some stochastic experiment and (2) can thus be generalised to 


\[ P^*(A / P(A) = p) = p \tag{3} \]

Note that (3) is a direct consequence of (1).

An ingenious attack on (3) was mounted by Miller in his [1966], pp. 59-61.
Miller states (3) in the syntactic form (which he calls the Straight Rule):

\[ P(A / \text{‘}p(a) = r\text{’}) = r \tag{4} \]
where A is the name of the statement ‘event a occurs’, P is a logical measure, 
and p is a physical, statistical measure (say, a propensity), defined on an appropri- 
ate field of events. Miller derives a contradiction from (4) as follows: 


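\[ P(A / \text{‘}p(a) = 1/2\text{’}) = 1/2 \quad \text{from (4)} \tag{5} \]

\[ P(A / \text{‘}p(a) = p(\bar{a})\text{’}) = p(\bar{a}) \quad \text{from (4)} \tag{6} \]

\[ p(a) = p(\bar{a}) \leftrightarrow p(a) = 1/2 \tag{7} \]

\[ 1/2 = P(A / \text{‘}p(a) = 1/2\text{’}) = P(A / \text{‘}p(a) = p(\bar{a})\text{’}) = p(\bar{a}) \tag{8} \]

\[ p(\bar{a}) = 1/2 = p(a) \tag{9} \]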


Already we have an apparent absurdity, for p is meant to be an empirical 
statistical measure which depends for its value on the state of the world, whereas 
we seem to have proved from the a priori stipulation (4) that it must take the 
value 1/2 at an arbitrary event a. Repeating the above type of argument with
the two substitution instances, ‘p(a) = 2/3’ and ‘p(a) = 2p(ā)’, it is similarly
possible to prove that p(a) = 2/3. It certainly appears as though a contradiction
has been derived from (4) by means of logically valid inferences. However, we 
believe that, properly formulated, the straight rule is immune to this derivation. 
If both the logical probability measure P and the empirical measure p are taken to 
be defined on a field of propositions (represented as sets of ‘possible worlds’) 
then it can be shown that Miller’s proof of a contradiction involves a fallacy to 
which attention will now be drawn. 


2 FALLACIES INVOLVING SUBSTITUTION FOR BOUND VARIABLES 


An important class of fallacies results from insufficient care in distinguishing
bound from free variables. For example, we cannot, in number theory, infer
from

\[ \forall x\, \exists y\, (x < y) \tag{1} \]

the conclusion

\[ \exists y\, (y < y) \tag{2} \]

In (1) any term can be substituted for x except one that involves the bound
variable, y (free). In the language of modern logic y is not free for x in (1).1
Variables may be bound not only by quantifiers, but also by other types of
functionals (operators): for example, summation operators, integral operators,

1 See Mendelson [1966], p. 48.

and inter alia set and function abstraction operators. Fallacies analogous to (2)
can be perpetrated with respect to all these. Thus the equation 


\[ \int_0^1 f(x, x)\, dx = F(x) \tag{3} \]

cannot be inferred from

\[ \int_0^1 f(x, y)\, dx = F(y) \tag{4} \]

since x is not free for y in (4): x is bound by the operator $\int_0^1 \dots\, dx$.


Again, consider Church’s abstraction operator λ.1 A term like ‘x+x’ determines a
unique number when x is assigned a particular value, n. ‘λx[x+x]’, on the other
hand, denotes the one argument function, which assigns to each number n the
value that ‘x+x’ determines when n is assigned to x. Let the second order
functional F be defined for all one argument first order functions, as follows:

(a) For any k, F(λx[x+k]) = k

(b) For all other first order functions f, F(f) = 0.

Then from (a) it would again be fallacious to infer

\[ F(\lambda x[x + x]) = x \tag{5} \]

for from (5) and (b) we immediately infer the false statement

\[ \forall x\, (x = 0) \tag{6} \]


The wrong move in (5) and (3) is essentially the same: substituting a term not 
free for the replaced variable. 

These errors are rather obvious when all variables involved are made explicit 
in the formalism. However, it sometimes happens that certain variables are 
suppressed for conciseness of presentation, and in such cases fallacies like these 
may slip past the most vigilant guard. It is our contention that there is an 
extremely natural reading of the straight rule which involves suppressed 
variables, and that Miller’s proof of a contradiction violates the above restrictions 
on substitutions when the suppressed variable is made explicit. 


3 PARADOX LOST 
Miller derives a contradiction from the straight rule as formulated in section (1):

\[ \forall r \in [0, 1]\ (P(A \mid \text{‘}p(a) = r\text{’}) = r) \tag{1} \]

However, there is an apparent difficulty in formulating the rule in this manner.
It arises from the fact that the quantifier in (1) binds free occurrences of the
variable r in

\[ P(A \mid \text{‘}p(a) = r\text{’}) = r \tag{2} \]


1 In fact all operators binding variables can be reduced to a combination of λ-abstraction
together with the application of some higher order function to an argument. In this
way all the examples cited become instances of one fallacy.

But there is only one such occurrence: r does not appear as a variable in the
left-hand side of (2) but merely as a letter of the alphabet used in constructing
a metalinguistic name of the formula ‘p(a) = r’. The quantifier cannot ‘break
through’ the quotation marks. Consequently (1) yields a contradiction by two
simple substitution instances.

\[ P(A \mid \text{‘}p(a) = r\text{’}) = 1/2 \tag{3} \]

\[ P(A \mid \text{‘}p(a) = r\text{’}) = 1/3 \tag{4} \]

It may be possible to restate the rule in a way that avoids this objection but 
which leaves it vulnerable to Miller’s derivation in section (1). However, it is 
our claim that it can be stated in such a way that renders it free of both these 
faults. 

Our analysis proceeds from the observation that p is an empirical measure
while P is a logical, or a priori, measure. Let us consider the general form of (1)
with the measures defined on the same field, which we shall provisionally identify
as a field of propositions. Then (1) becomes

\[ \forall r \in [0, 1]\ (P(A \mid p(A) = r) = r) \tag{5} \]


This formulation certainly avoids the problem generated by quotation-mark 
names, but it is not immediately clear that (5) is well-formed. For it appears 
that ‘A’ is of a different type, or level, than ‘p(A) = r’, and it may appear that
this difference of type carries over to the denoted propositions. However, the 
second argument in (5) bears an obvious analogy to a random variable set at 
a particular value. Indeed it can be interpreted in just this way. 

Suppose that P’ is any measure defined on a σ-field generated from some set
of elements, S. It often happens, in the standard theory of probability, that 
certain functions, random variables, are defined on S, and that P’ assigns a 
value to the set on which the function takes a certain value. Thus if X is a 
random variable the measure of the set on which X takes the value r is given by 
P'(X = r); this is an abbreviation of the formula ‘P’({s:X(s) = r})’. Similarly 
a conditional probability involving two random variables X and Y may be 
defined yielding, say: 


\[ P(X = x \mid Y = y) = f(x, y) \tag{6} \]

(where f(x, y) is a function of x and y). In (6) the variable ranging over the
elements of S is suppressed, but it is there implicitly. ‘X = x’ is just an
abbreviation for ‘{s: X(s) = x}’. A particular instance of (6) might be

\[ P(X = 1 \mid Y = y) = y \quad \text{for } y \in [0, 1] \tag{7} \]

Now (7) is very nearly of the form of (5), and (5) can indeed be cast in this form.
Suppose that p and P are measures on a σ-field generated by a set of possible
worlds—each possible world is a realisation of some appropriate language as 
a definite physical structure, in which, among other things, statistical probabilities 
and propensities are determined. It is well known that propositions of the 
language can be represented by the sets of worlds in which they are true. Now P, 
being an a priori or logical measure simply associates each pair of propositions 
at which it is defined, with a real number. p, on the other hand, is an empirical 
measure, and it is clear that the value it associates with a proposition depends 
on what the structure of the world is like. That is just to say that in different
worlds p assigns different numbers to one and the same proposition. It is easy
to see that if the world is one type of structure the propensity of a certain event
may be 1/2, if another, 1/3. Thus we have a family of measures, $p_w$, one for
each possible structure, w.

Consider a very simple logical space generated by an event (which has two 
possible outcomes) and the propensity of one outcome of the event to occur. 
(The event could be the toss of a coin which yields either heads or tails.) Let 
one outcome be H and the other T. Then each ‘world’ can be represented by 
an ordered couple the first member of which is either H or T, and the second 
of which is a real number between 0 and 1. The first member indicates the
outcome of the event and the second the propensity of the event H to occur in 
that world. Consider the two random variables defined: 


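\[ X_H(w) = \begin{cases} 0 & \text{if } w = \langle T, r \rangle \text{ for some } r \in [0, 1] \\ 1 & \text{if } w = \langle H, r \rangle \text{ for some } r \in [0, 1] \end{cases} \]

\[ Y_H(w) = r \quad \text{if } w = \langle H, r \rangle \text{ or } w = \langle T, r \rangle \tag{8} \]

The proposition that heads occurs is represented by the set of worlds in which
the outcome is heads, $\{w : X_H(w) = 1\}$, abbreviated to ‘$X_H = 1$’; and the
proposition that the propensity of heads to occur is r, by the set $\{w : Y_H(w) = r\}$,
abbreviated to ‘$Y_H = r$’.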

What can be done in this simple case is quite general. Let us take the field on 
which the measures P and $p_w$ are defined, to be a field of subsets of the set of
all possible worlds, and define, for an arbitrary proposition A the random
variables:

\[ X_A(w) = \begin{cases} 1 & \text{if } A \text{ is true in } w \\ 0 & \text{otherwise} \end{cases} \qquad Y_A(w) = r \text{ iff } p_w(A) = r. \tag{9} \]

Now (5) can be written in the formally unobjectionable form

\[ \forall r \in [0, 1]\ (P(\{w : X_A(w) = 1\} \mid \{w : Y_A(w) = r\}) = r) \tag{10} \]

which can be abbreviated to

\[ \forall r \in [0, 1]\ (P(X_A = 1 \mid Y_A = r) = r) \tag{11} \]


Hence the principle which Miller reduces to absurdity does have a prima facie 
formulation in what has become the standard representation of probability 
theory. It is now possible to evaluate Miller’s derivation. Firstly note that step (3) 
of Miller’s argument, in the abbreviated form, is: 


\[ P(X_A = 1 \mid Y_A = Y_{\bar{A}}) = Y_{\bar{A}} \tag{12} \]

Secondly, for step (4) to be valid, that is,

\[ Y_A = Y_{\bar{A}} \leftrightarrow Y_A = 1/2 \tag{13} \]

we must take it that ‘$Y_A = Y_{\bar{A}}$’ is an abbreviation for ‘$\{w : Y_A(w) = Y_{\bar{A}}(w)\}$’,
that is, that ‘$Y_{\bar{A}}(w)$’ has been substituted for the bound variable in (10). But then
it is apparent that (12) involves a fallacy of the type mentioned in section 2.
On expanding (12) this is made explicit.

\[ P(\{w : X_A(w) = 1\} \mid \{w : Y_A(w) = Y_{\bar{A}}(w)\}) = Y_{\bar{A}}(w) \tag{14} \]

(14) is tantamount to the proposition that the propensity of A is 1/2: $Y_A = 1/2$.
When (12) is properly analysed it is apparent that the variable w is bound by 
the set abstraction operator on the left-hand side of the equation and free on the 
right. Such a substitution instance is clearly illegitimate for purely logical 
reasons. 
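
The possible-worlds reading can be made concrete in a toy model. The sketch below is only an illustration under stated assumptions: the three propensity values, the uniform weighting over them, and the names `WORLDS`, `P`, `X_H`, `Y_H`, `Y_T` and `cond` are all invented here, and the logical measure is simply stipulated so that the rule in its random-variable form holds.

```python
from fractions import Fraction

# A world is a pair (outcome, propensity of heads); the values are hypothetical.
PROPENSITIES = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)]
WORLDS = [(outcome, r) for outcome in "HT" for r in PROPENSITIES]


def P(world):
    """Stipulated logical measure: uniform over propensities, then r for heads."""
    outcome, r = world
    prior = Fraction(1, len(PROPENSITIES))
    return prior * (r if outcome == "H" else 1 - r)


def X_H(w):
    """1 if heads occurs in world w, 0 otherwise."""
    return 1 if w[0] == "H" else 0


def Y_H(w):
    """Propensity of heads in world w."""
    return w[1]


def Y_T(w):
    """Propensity of tails (the complementary event) in world w."""
    return 1 - w[1]


def cond(event, given):
    """Conditional measure of the set picked out by `event` given the set `given`."""
    num = sum(P(w) for w in WORLDS if event(w) and given(w))
    den = sum(P(w) for w in WORLDS if given(w))
    return num / den


# The rule with r a genuine number: P(X_H = 1 | Y_H = r) = r.
for r in PROPENSITIES:
    assert cond(lambda w: X_H(w) == 1, lambda w, r=r: Y_H(w) == r) == r

# 'Substituting' Y_T for r yields a set abstraction in which w is bound:
# {w : Y_H(w) = Y_T(w)} is just {w : Y_H(w) = 1/2}, and the conditional
# measure is the number 1/2, not something that varies from world to world.
same_set = [w for w in WORLDS if Y_H(w) == Y_T(w)]
assert same_set == [w for w in WORLDS if Y_H(w) == Fraction(1, 2)]
assert cond(lambda w: X_H(w) == 1, lambda w: Y_H(w) == Y_T(w)) == Fraction(1, 2)
```

Inside `cond` the world variable is bound by the comprehension, so there is no world left over on the right-hand side at which a propensity could be evaluated; this is the programming counterpart of the restriction on substitution.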


4 A CONSISTENCY ARGUMENT FOR THE STRAIGHT RULE 


Earlier we considered an experiment consisting of tossing a coin n times. Here
we have a sequence of Bernoulli trials, parameter p ∈ [0, 1]. The Straight Rule is
equivalent to

\[ P(H_i / p(H) = p) = p \tag{1} \]

where $H_i$ is the proposition that H occurs at the ith trial, 1 ≤ i ≤ n.

Now consider a very simple first order language characterised by one unary 
predicate B and n individual constants $a_1, \dots, a_n$. Let C be the sentence which
asserts that there are exactly n distinct individuals

\[ C = \bigwedge_{1 \le i < j \le n} (a_i \ne a_j) \wedge \forall x \Big( \bigvee_i x = a_i \Big) \]

Now let us form the conditional logical probability

\[ P(r, n) = P(Ba_i / C \wedge (RF = r/n)) \]

where (RF = r/n) is the sentence formed by the disjunction of all conjunctions

\[ \pm Ba_1 \wedge \pm Ba_2 \wedge \dots \wedge \pm Ba_n \]

containing exactly r unnegated Bs; hence (RF = r/n) ∧ C asserts that the
relative frequency of Bs in the universe is r/n. P’s distinctively logical character
can be expressed in the condition that P(A/B) measure the proportion of B’s
models (factored say by isomorphism) which are models of A.1 P(r, n) is then
easily computed to be

\[ {}^{n-1}C_{r-1} \,/\, {}^{n}C_{r} = r/n. \]

Hence, defining $P_C = P(\dots / \dots C \dots)$ we get, since n can take arbitrarily large
values,

\[ P_C(Ba_i \mid RF = p) = p \tag{2} \]

for every rational p in [0, 1].
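
The counting behind the value r/n can be checked by enumeration. The sketch below is an illustration only: the function name `straight_rule_proportion` is invented, each truth-value assignment to Ba_1, ..., Ba_n is treated as one model of C, and the factoring by isomorphism mentioned in the text is ignored.

```python
from fractions import Fraction
from itertools import product
from math import comb


def straight_rule_proportion(n, r, i=0):
    """Proportion of assignments with exactly r true Bs in which the i-th B is true.

    An assignment is a tuple of n truth values, one per individual constant;
    i is a zero-based index into that tuple.
    """
    models = [m for m in product([True, False], repeat=n) if sum(m) == r]
    favourable = [m for m in models if m[i]]
    return Fraction(len(favourable), len(models))


if __name__ == "__main__":
    for n in range(1, 7):
        for r in range(1, n + 1):
            proportion = straight_rule_proportion(n, r)
            # Ratio of binomial coefficients: conjunctions with the i-th B
            # unnegated over all conjunctions with exactly r unnegated Bs.
            assert proportion == Fraction(comb(n - 1, r - 1), comb(n, r))
            assert proportion == Fraction(r, n)
    print("proportion equals r/n for all n up to 6")
```

The result does not depend on which individual is chosen, since the set of assignments with exactly r true values is symmetric in the constants.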

Both Popper’s propensity and von Mises’s frequency theory evaluate p as 
approximately the long-run relative frequency of the character in question, and 
so (granted this method) we can write (1) in the form 

\[ P(H_i / \text{long run relative frequency of } H = p) = p. \]

But the transformation H ↦ B, i ↦ $a_i$, p ↦ p yields (2) and so the consistency of (1)
reduces to that of the classical theory of probability, for the logical measure was 
computed essentially according to the classical ratio of numbers of favourable 
to possible cases. And the classical theory is trivially consistent. Hence the 
straight rule is a consistent method of evaluating conditional probabilities 


P(A/p(A) =r). 


1 This condition was investigated in its bearing on the inductive or other character of
logical measures in Howson [1975].


5 OTHER ‘PARADOXES’ IN PROBABILITY THEORY


‘Paradoxes’, completely analogous to the one produced by Miller, are very easy 
to come by in the theory of probability whenever empirical magnitudes occur 
in probabilistic contexts and the logical form of such statements is not made 
explicit. There is nothing special about the straight rule and the relation it 
postulates between logical and physical probabilities. Consider for example, a 
roulette wheel, divided evenly into ten segments labelled 1, ..., 10. Let φ
be the number of the segment in which a ball lands when the wheel is spun
randomly. Supposing the wheel to be bias-free the probability that the ball
will land in any particular segment is 1/10. We could represent this assumption
thus:

\[ P(\phi \le i) = i/10, \quad 1 \le i \le 10. \tag{1} \]


(1) does not, as it stands, presuppose any particular account of probability
or the entities on which probability functions are defined. By one substitution,
analogous to that of step (6) in Miller’s argument, we get:

\[ P(\phi \le \phi) = \phi/10 \tag{2} \]

But the probability that φ ≤ φ is clearly maximal, 1. So we get:

\[ \phi = 10 \tag{4} \]


From the stipulation (1) that each segment is equally likely to receive the ball 
we have proved that it must land in the tenth segment! However we could 
equally well have drawn from our assumption, instead of (1): 

\[ P(\phi < i) = (i - 1)/10, \quad 1 \le i \le 10 \tag{5} \]

And by the substitution of φ for i:

\[ P(\phi < \phi) = (\phi - 1)/10 \tag{6} \]

But since the probability that φ < φ must be zero, we get:

\[ \phi = 1. \tag{7} \]

Thus the ball must land in the first segment. 

The ‘paradox’ here is equally well dissolved by the recognition that φ is
a function on an outcome space: it associates possible outcomes with numbers.
Thus (1) should be written:

\[ P(\{w : \phi(w) \le i\}) = i/10 \tag{1$'$} \]

The substitutions in (2) and (6) would fail for exactly the same reasons as step (6) 
of Miller’s argument. 
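
The roulette example can be played out directly with φ treated as a function on outcomes. In this sketch the names `OUTCOMES`, `phi` and `P` are invented for illustration, and the outcome space is taken to be the ten segments themselves, equally weighted.

```python
from fractions import Fraction

OUTCOMES = range(1, 11)  # one possible outcome per segment of the wheel


def phi(w):
    """Number of the segment in which the ball lands in outcome w."""
    return w


def P(event):
    """Measure of the set {w : event(w)} under equal weights."""
    favourable = [w for w in OUTCOMES if event(w)]
    return Fraction(len(favourable), len(OUTCOMES))


# With i a genuine number the stipulations hold for every i from 1 to 10.
for i in OUTCOMES:
    assert P(lambda w, i=i: phi(w) <= i) == Fraction(i, 10)
    assert P(lambda w, i=i: phi(w) < i) == Fraction(i - 1, 10)

# Putting phi in place of i inside the set abstraction gives the whole space
# (measure 1) or the empty set (measure 0). Neither value is "phi/10" or
# "(phi - 1)/10", because outside the abstraction there is no outcome w left
# at which phi could be evaluated.
assert P(lambda w: phi(w) <= phi(w)) == 1
assert P(lambda w: phi(w) < phi(w)) == 0
```

Written this way, the paradoxical equations cannot even be formed: the left-hand side is a number, while the would-be right-hand side needs an outcome as argument.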


6 FUNCTIONS AND NUMBERS 


Richard Jeffrey has recognised that what makes step (3) of Miller’s argument
invalid is the substitution of a so-called ‘factual designator’ of numbers.1 But
there has, as yet, been no clear account of the mathematical reason why such a 
factual designator cannot be substituted for the variable r in (4). The core of 


1 See Jeffrey [1970] and [1975]. For full bibliography see Jeffrey’s review of the literature
on Miller’s paradox in Jeffrey [1970].

our argument is that such a designator does not pick out a particular number,
but a function from worlds to numbers.1 Miller discusses related views in an
article entitled ‘What’s in a Numeral’2 and in one paragraph he dismisses the
function account as ‘merely evasive’. The reasons for this abrupt dismissal are 
not entirely clear, but Miller does note that the English language expression 
(to change his example) ‘The propensity of the toss to yield heads is 1/2’ does 
not contain a possible-world variable and he hints that it is not legitimate to 
maintain that one occurs there implicitly. But natural languages are notorious 
for omission of explicit reference to bound variables, whether the variables 
range over individuals, numbers, or, as in this case, possible situations. Consider, 
for example, the expression ‘Everybody is beautiful’. No-one has qualms about 
an analysis of this sentence which makes explicit the universally quantified 
variable—this was, in fact, one of the main achievements of modern logic. The 
fact that the function analysis makes explicit unmentioned variables is not a 
criticism of the account at all. 

However, Miller has another argument which is developed in the rest of his 
paper. Our reason for saying that ‘the propensity of the toss to yield heads’ 
picks out a function and not a number is that in some worlds the propensity 
takes the value 1/2, and in others, 1/3. Now Miller claims, in effect, that this 
also shows that ‘1/2’ picks out a function—the function which in some worlds 
has the value, the propensity of the toss to yield heads, and in other worlds 
lacks this value. Miller does not say outright that ‘1/2’ picks out a function from 
worlds to numbers, for this would fly in the face of the universal convention that 
‘1/2’ shall denote a number, the number 1/2. But an insight into what Miller 
is getting at is afforded by a thesis argued for later in the paper. There Miller 
claims that we can regard ‘the propensity of the toss to yield heads’ as designating 
a single number, but a number which changes as we go from world to world. 
Similarly, ‘the number of people on earth’ does not designate a function which 
(evaluated at a world) takes time points to numbers, but rather a number which 
changes through time. ‘Changing numbers’, that is, quantities which depend on 
certain parameters, have always been analysed as functions, and it is quite clear 
that functions will play the role of Miller’s changing numbers without the 
suspect metaphysics. 
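
The contrast can be put schematically. What follows is only a sketch in notation of our own choosing: the set $W$ of possible worlds, the set $T$ of time points, and the symbol $N$ are illustrative assumptions, not notation taken from Miller or Tichý.

```latex
% Illustrative notation only (assumed, not from the source):
% W = possible worlds, T = time points, N = population function.
% Requires amsmath and amssymb.
\[
  \text{`the propensity of the toss to yield heads'}
  \;\longmapsto\; \bigl( w \mapsto p_w(\text{heads}) \bigr) : W \to [0,1],
\]
\[
  \text{`the number of people on earth'}
  \;\longmapsto\; N : W \to (T \to \mathbb{N}),
\]
\[
  \text{`1/2'} \;\longmapsto\; \tfrac{1}{2} \in [0,1].
\]
```

On this reading a designator of the first kind takes the value 1/2 at some worlds and 1/3 at others, while the numeral denotes the same number throughout; no number need be said to change.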

This thesis, though obscure as Miller himself admits,³ requires some rebuttal 
if our analysis of the paradox is to go through. But it is clear that Miller’s thesis 
yields patently unacceptable results if we apply it to other designators in an 
entirely analogous manner. Consider, for example, Miss World. Now it is well 
known that Miss World is a different person each year, say Jane Smith (of Brazil) 


1 Tichý has enunciated and argued forcefully for the theory that all physical magnitudes 
are to be analysed as such functions; similarly, all individual concepts (like that of the 
president of the United States) should be analysed as functions from possible situations 
to individuals. Thus the analysis we propose here would follow as a matter of course 
from Tichý’s account. See Tichý [1971], especially pp. 288-9, where very similar 
‘paradoxes’ are analysed. See also Tichý [1978]. In Tichý’s λ-notation the principle (4) 
is constructed $\lambda w.\forall r \in [0,1]\; P(A \mid \lambda w.p_w(A) = r) = r$. Substitution of $p_w(\bar{A})$ for $r$ is 
illegitimate, the variable $w$ being bound again by the second abstraction operator within the 
brackets. The construction which results from such a substitution is equivalent to 
$\lambda w.p_w(A) = 1/2$, i.e. to what is expressed by ‘The propensity of A is 1/2’. 

2 Miller, D. W. [1978]. 

3 See Miller [1978], pp. 74-5. 
in 1976, and Mary Jones (of Iceland) in 1977 (with respective mothers, Mrs 
Smith and Mrs Jones). But according to Miller’s proposal we could equally 
regard Miss World as having always been the same person who changes through 
time. It will follow from this that there is a person, namely Miss World, who 
has a different mother each year. That there are several mothers who have given 
birth to one and the same child, Miss World, seems to be a rather embarrassing 
biological consequence of Miller’s thesis together with a couple of universally 
uncontested facts. Miller could retort that the mother of Miss World is indeed 
also a single person, but one who herself changes. One year she is Mrs Smith, and 
another she happens to be Mrs Jones. But then it would follow that each year, 
at the instant that Miss World is crowned, there is someone in the world, the 
mother of Miss World, who instantaneously changes her position—her change 
of position would be instantaneous and discontinuous. This would contradict 
the theory of relativity—one would have to add exceptions to the theory in 
order to take care of the amazing capabilities of Miss World’s mother. It is 
perhaps not logically impossible to go on revising the fundamental views we 
have on people, space and time, but it hardly seems that it is worth the trouble 
to justify a dislike for the straight rule. 


COLIN HOWSON and GRAHAM ODDIE 
London School of Economics 
CAUCHY’S VARIABLES AND ORDERS OF THE 
INFINITELY SMALL 


In his four textbooks, published between 1821 and 1829, Cauchy often uses 
variables converging to zero, which he calls ‘infinitely small quantities’ (Cauchy 
[1821], p. 27). But what are Cauchy’s variables? 

Cauchy says: ‘One calls a quantity variable which one thinks of as having to 
take on successively a number of values different from each other’ (Cauchy 
[1821], p. 4). This is all he has to say about the definition of variable in his 
textbooks ([1821], [1823], [1826], [1829]). Abraham Robinson suggested that 
what Cauchy meant by a variable is ‘a function whose range is numerical while 
its domain may be any ordered set without last element’ (Robinson [1966], p. 
270). John Cleave says: ‘As a first attempt at interpreting [Cauchy’s] “variable” 
in terms of non-standard analysis, we can imagine that the successive values of a 
variable are prescribed by a sequence $s = \{s(n) : n \in N\}$ of reals (i.e. a function 
$s$ with domain $N$ and range in $R$)’ (Cleave [1971], p. 29). According to this, 
Cauchy’s infinitely small quantities become real null sequences. 

However, we must be cautious in interpreting Cauchy’s variables as functions 
(or sequences), since Cauchy defines functions in terms of variables (Cauchy 
[1821], p. 19). Furthermore, it appears that at times Cauchy let his variables 
take on infinitesimal as well as real values, as I have argued elsewhere (Fisher 
[1978]). In fact, Cauchy does not make his concept of variable very clear. 

My purpose here is to investigate Cauchy’s concept of ‘variable decreasing 
indefinitely’ or ‘infinitely small quantity’ by examining the theory of infinitely 
small quantities he introduced in his Cours d’Analyse of 1821. We will see in 
the process that Cauchy’s variables are not what Robinson or Cleave understand 
by functions or sequences. Cauchy develops his theory of infinitely small 
quantities further in his later textbooks (Cauchy [1823], appendix; Cauchy 
[1826], Lecture 9; Cauchy [1829], Lecture 6), and he presents some interesting 
classifications and applications. However, we will be concerned here only with 
the foundations of his theory. 

At the same time, we will see clearly how Cauchy had to give up some of the 
simplicity of actual infinitesimals when he worked with his infinitely small 
quantities. The situation is somewhat analogous to that of Copernicus whose 
theory, it is said, was at first more complicated than Ptolemy’s for many purposes. 
Some also say that from the point of view of relativity theory, Ptolemy’s theory 
is justifiable; it is only a matter of where one puts the center of coordinates. 
Similarly, we now know from the work of Abraham Robinson and his followers 
on non-standard analysis that actual infinitesimals are justifiable, so Cauchy 
need not have given them up in the name of rigor. 

To begin with, we note that if the existence of non-zero infinitesimals is 
admitted, it is easy enough to compare their sizes. We can simply take their 
ratios. For example, given a function f we can set 


$dy = f(x + dx) - f(x)$ 
for a non-zero infinitesimal dx, and take the derivative of f at x to be dy/dx. A 
modern definition of this kind is given by Martin Davis (Davis [1977], pp. 63-5). 
Naturally the case where dy/dx is finite is of special interest. Clearly this will 
happen if and only if dy = k dx for some finite k. 
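
As a simple illustration (the choice of $f(x) = x^2$ is ours and is not taken from Davis), the recipe gives the following.

```latex
% Worked example with the assumed function f(x) = x^2 and a non-zero infinitesimal dx.
\begin{align*}
  dy &= (x + dx)^2 - x^2 = 2x\,dx + dx^2, \\
  \frac{dy}{dx} &= 2x + dx .
\end{align*}
```

Here $dy = k\,dx$ with $k = 2x + dx$, which is finite whenever $x$ is finite and differs from $2x$ only by an infinitesimal.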

However, Cauchy does not proceed directly with infinitesimals. He introduces 
instead a calculus of ‘infinitely small quantities’. In the Cours d’Analyse, he says: 
‘One says that a variable quantity becomes infinitely small when its numerical 
value decreases indefinitely in such a way as to converge toward the limit zero’ 
(Cauchy [1821], p. 26). And again: ‘Let $\alpha$ be an infinitely small quantity, that is, 
a variable whose numerical [i.e. absolute] value decreases indefinitely’ (Cauchy 
[1821], p. 27). Then, Cauchy continues, $\alpha$, $\alpha^2$, $\alpha^3$, etc., are called infinitely small 
of the first, second, third order, etc. It is notable that Cauchy does not explicitly 
require the values of these variables to be finite. 

More generally, says Cauchy, a variable quantity is infinitely small of the 
first order if its ratio with $\alpha$ converges to a finite limit different from zero as the 
numerical value of $\alpha$ decreases; infinitely small of the second order if it varies 
with $\alpha$ and its ratio with $\alpha^2$ converges to such a limit, etc. Presumably Cauchy 
does not allow $\alpha$ to have the value zero. 

‘Given this’, he continues, ‘if one denotes by $k$ a finite quantity different 
from zero, and by $\varepsilon$ a variable number which decreases indefinitely with the 
numerical value of $\alpha$, the general form of infinitely small quantities of the first 
order will be $k\alpha$ or at least $k\alpha(1 \pm \varepsilon)$’ (Cauchy [1821], p. 28). In Cauchy’s 
terminology, a ‘number’ is positive, so $\varepsilon$ is always positive. ‘The general form’, Cauchy 
continues, ‘of the infinitely small [quantities] of order $n$ ... will be $k\alpha^n$ or at 
least $k\alpha^n(1 \pm \varepsilon)$’ (Cauchy [1821], p. 28). (Cauchy speaks here of infiniment petits 
rather than quantités infiniment petites.) 

Cauchy does not offer any proof that the general forms follow from his 
definition. Actually, there are a couple of small difficulties. These can be avoided 
by claiming the general form to be $k\alpha^n(1 + \varepsilon)$, where $\varepsilon$ may be negative as well 
as positive. Then we can supply a proof that this is indeed the most general 
form. In fact, if $\beta = \beta(\alpha)$ is an infinitely small quantity and $\beta/\alpha^n$ goes to $k \neq 0$ 
as $\alpha$ goes to 0 (or $|\alpha|$ goes to 0), set $\varepsilon = (\beta/k\alpha^n) - 1$. Then $\varepsilon$ goes to 0 with $\alpha$, 
and $\beta = k\alpha^n(1 + \varepsilon)$. Conversely, if $\beta = k\alpha^n(1 + \varepsilon)$ where $\varepsilon$ goes to 0 with $\alpha$, 
then $\beta/\alpha^n$ goes to $k$ as $\alpha$ goes to 0 (even if $k = 0$). 
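
A small worked instance may help here; the particular quantity $\beta = 3\alpha^2 + \alpha^3$ is chosen purely for illustration and does not occur in Cauchy.

```latex
% Assumed example: beta = 3 alpha^2 + alpha^3, so that n = 2 and k = 3.
\begin{align*}
  \frac{\beta}{\alpha^2} &= 3 + \alpha \;\longrightarrow\; 3 \quad (\alpha \to 0), \\
  \varepsilon &= \frac{\beta}{3\alpha^2} - 1 = \frac{\alpha}{3}, \\
  \beta &= 3\alpha^2\Bigl(1 + \frac{\alpha}{3}\Bigr).
\end{align*}
```

Since $\varepsilon = \alpha/3$ is negative whenever $\alpha$ is negative, the example also suggests why $\varepsilon$ should be allowed either sign if $\alpha$ is.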

In any case, Cauchy considers infinitely small quantities of the form $k\alpha^n(1 \pm \varepsilon)$. 
He proves, for example: ‘If one compares two infinitely small [quantities] of 
different orders when both converge to the limit zero, that which has the higher 
order will finish by always taking on the smaller numerical value’ (Cauchy 
[1821], p. 29). 

Cauchy’s proof runs as follows: 


In fact, let $k\alpha^n(1 \pm \varepsilon)$, $k'\alpha^{n'}(1 \pm \varepsilon')$ be two infinitely small [quantities], one 
of order $n$ and the other of order $n'$, and suppose $n' > n$; the ratio between 
the second of these infinitely small [quantities] and the first, namely, 


$\frac{k'}{k}\,\alpha^{n'-n}\,\frac{1 \pm \varepsilon'}{1 \pm \varepsilon}$ 


will converge indefinitely with $\alpha$ toward the limit zero; which cannot 
happen unless the numerical value of the second ends by becoming always 
less than that of the first (Cauchy [1821], p. 29). 


This is all the proof Cauchy gives, and it is not very explicit. He can perhaps 
be taken to have meant when he says ‘the numerical value of the second ends 
by becoming always less than the first’ that there is a number $t$ such that 


$|k'\alpha^{n'}(1 \pm \varepsilon')| < |k\alpha^n(1 \pm \varepsilon)|$ 


for all $|\alpha| < t$. Whether we should take $t$ finite or not is a moot question; Cauchy 
gives no hint. In any case, given this as equivalent to Cauchy’s conclusion, we 
can fill in a proof as follows. If this conclusion were not so, there would be a 
sequence $\alpha_1, \alpha_2, \ldots$ going to zero such that 


$|k'\alpha_j^{n'}(1 \pm \varepsilon')| \geq |k\alpha_j^n(1 \pm \varepsilon)|$ 

for each $j$. Then we would have 

$\left|\frac{k'\alpha_j^{n'}(1 \pm \varepsilon')}{k\alpha_j^n(1 \pm \varepsilon)}\right| \geq 1$ 


for all $j$. Yet $|\alpha_j^{n'-n}(k'/k)((1 \pm \varepsilon')/(1 \pm \varepsilon))|$ must go to zero as $\alpha_1, \alpha_2, \ldots$ does, since 
$n' - n > 0$. 
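
The same conclusion can also be reached directly, with an explicit bound $t$. The sketch below rests on an assumption not stated by Cauchy: that there is a $t_0 > 0$ such that $|\varepsilon| \le 1/2$ and $|\varepsilon'| \le 1/2$ whenever $0 < |\alpha| < t_0$, as will be the case if both go to zero with $\alpha$.

```latex
% Assumptions (ours): |eps| <= 1/2 and |eps'| <= 1/2 for 0 < |alpha| < t_0;
% k and k' are non-zero and n' > n.
\begin{align*}
  \left| \frac{k'\alpha^{n'}(1 \pm \varepsilon')}{k\alpha^{n}(1 \pm \varepsilon)} \right|
    &\le \left| \frac{k'}{k} \right| |\alpha|^{n'-n} \cdot \frac{3/2}{1/2}
     = 3 \left| \frac{k'}{k} \right| |\alpha|^{n'-n}, \\
  3 \left| \frac{k'}{k} \right| |\alpha|^{n'-n} &< 1
    \quad\text{whenever}\quad
    |\alpha| < \left( \frac{|k|}{3|k'|} \right)^{1/(n'-n)} .
\end{align*}
```

Under these assumptions one may take $t$ to be the smaller of $t_0$ and $(|k|/3|k'|)^{1/(n'-n)}$, the inequality then holding for all non-zero $\alpha$ with $|\alpha| < t$.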

At this stage, even though Cauchy does not say we should, it is easy to interpret 
his infinitely small quantities as variables which go in some way not further 
specified to zero, while taking on only finite (real) values, i.e. without taking on 
infinitesimal values. But we see the price Cauchy had to pay. The theorem 
about infinitely small quantities we just quoted corresponds to the following 
theorem about infinitesimals: If $\alpha$ is an infinitesimal and $n' > n$, then $\alpha^{n'}/\alpha^n$ is 
an infinitesimal. 

Let us now examine a little how Cauchy applies his infinitely small quantities. 
A theorem on the next page from the one we quoted above says: 


Any polynomial ordered according to increasing powers of $\alpha$, for example, 
$a + b\alpha + c\alpha^2 + \&c\ldots$ or more generally, 


$a\alpha^n + b\alpha^{n'} + c\alpha^{n''} + \&c\ldots$ 


(the numbers $n$, $n'$, $n''$, ... forming an increasing sequence) ends by being 
always of the same sign as its first term $a$ or $a\alpha^n$ for very small values of $\alpha$ 
(Cauchy [1821], pp. 30-1). 


Cauchy’s proof is as follows. 


In fact, the sum of the second term and those which follow it is, in the 
first case, an infinitely small quantity of the first order whose numerical 
value ends by being less than that of the finite quantity a and, in the second 
case, an infinitely small quantity of order $n'$, which ends by always having 
a numerical value less than that of an infinitely small quantity of order $n$ 
(Cauchy [1821], p. 31). 


It is rather curious that Cauchy says of $b\alpha + c\alpha^2 + \&c\ldots$ that its ‘numerical 
value ends by being less than that of the finite quantity a’, using the singular 
‘value’. But perhaps this can be taken as a locution for the plural ‘numerical 
values end by being less than ... a’. The second part of this theorem follows 
from the first theorem we quoted. But as we have seen, Cauchy did not really 
so much prove the earlier theorem as declare it. He cannot be said to have used 
reasoning about sequences, or about limits in an epsilon-delta manner, although 
he sometimes did use such reasoning in other contexts (see Fisher [1978]). 
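
For a concrete case of the first part of the theorem, consider a three-term polynomial with $a \neq 0$ and $b$, $c$ not both zero. The bound below is our own illustration, argued in the modern manner; it is not Cauchy’s reasoning.

```latex
% Assumed example: p(alpha) = a + b*alpha + c*alpha^2 with a != 0 and |b| + |c| > 0.
\begin{align*}
  |b\alpha + c\alpha^2| &\le (|b| + |c|)\,|\alpha| \qquad \text{for } |\alpha| \le 1, \\
  (|b| + |c|)\,|\alpha| &< |a| \qquad \text{for } |\alpha| < \frac{|a|}{|b| + |c|} .
\end{align*}
```

Hence for $|\alpha| < \min\bigl(1, |a|/(|b|+|c|)\bigr)$ the sum $b\alpha + c\alpha^2$ is smaller in numerical value than $a$, and $a + b\alpha + c\alpha^2$ has the sign of $a$.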

A short space later, we find the following theorem: 


If, in the polynomial $a + b\alpha^{n'} + c\alpha^{n''} + \&c\ldots$ ordered according to 
increasing powers of $\alpha$, $n'$ denotes an even number, then among the values 
of this polynomial corresponding to infinitely small values of $\alpha$, that which 
corresponds to $\alpha = 0$, that is, $a$, will always be the smallest when $b$ is 
positive, and the greatest when $b$ is negative (Cauchy [1821], p. 32). 


Cauchy adds: ‘This particular value of the polynomial, greater or smaller than 
all the neighboring values, is what one calls a maximum or minimum’. 
Now we are some distance from the language of finite real variables. It is 
one thing to speak of the values of a polynomial $p(\alpha)$ for real values of $\alpha$ such that 
$|\alpha|$ is less than some positive real number $t$. But it is another thing to speak of 
values of $p(\alpha)$ corresponding to infinitely small values of $\alpha$. This is close to the 
language of actual infinitesimals. 

We have seen how Cauchy used his infinitely small quantities immediately 
after he introduced them. Their nature becomes no clearer later on in his 
textbooks. In the forewords to both his [1823] and his [1829], he says: 


My principal aim has been to reconcile the rigor, which I made my law 
in my Cours d’Analyse, with the simplicity which results from the direct 
consideration of infinitely small quantities (Cauchy [1823], p. v; [1829], 
p. 1 of the reprinting in his Oeuvres). 


Cauchy does not say what he means by ‘direct consideration of infinitely small 
quantities’. Perhaps he expected his readers to know what he meant. 

We must conclude, I think, that Cauchy did not make very clear what he 
meant by his infinitely small quantities, or variables with limit zero. It is quite 
clear, however, that he did not understand them to be functions or sequences, 
in the senses in which we now take these words. 


GORDON FISHER 
James Madison University 
1 INTRODUCTION 


This paper was conceived as a reply to Fisher [1979], in which Fisher criticises 
Cauchy’s theory of orders of infinitely small quantities and the interpretation 
of Cauchy’s variables in terms of sequences as given in Robinson [1966], 
Lakatos [1978] and Cleave [1971]. The principal purpose of this paper is to 
defend two claims regarding Cauchy’s work. 

Firstly, Cauchy was just one of a long succession of mathematicians who 
founded mathematical analysis on the notion of variable quantities whose values 
are real numbers, and the derived notion of infinitely small quantities which 
were defined as null sequences. 

Secondly, what distinguished Cauchy from all others in this tradition was 
that he counted these subsistent infinitely small quantities as being in the con- 
tinuum: they entered Cauchy’s theorems in an essential way, through his 
concept of neighbourhood. Cauchy’s alleged mistakes arise here: they are really 
correct theorems about the extended continuum. (This is the thesis advanced 
in Lakatos [1978].¹) 

A proper evaluation of Cauchy’s concepts must take into account not only 
Cauchy’s own work but also the tradition he founded. In support of the first 
claim I therefore quote several nineteenth century mathematicians whose 
expositions of analysis are readily available to me. Lacroix [1810] was the only 
mathematician whose work I have been able to consult and who published 
between 1800 and 1821, the date of publication of Cauchy’s Cours d’Analyse. 
It would be of some interest to make a wider survey of the opinions of Cauchy's 
immediate predecessors on infinitely small quantities. 

Concerning Cauchy’s use of his concept of neighbourhood, I have been unable 
to discover any follower of Cauchy who repeated this distinctive usage. 


2 VARIABLES AND LIMITS 


There was a common tradition amongst nineteenth century French mathe- 
maticians of founding mathematical analysis on the notion of a variable quantity. 
Limits of variables were then defined followed by infinitely small quantities. 
Take the following examples: 


1 Lakatos read the first version of this paper at the International Logic Colloquium, 
Hanover 1966, but he never published it. The most complete version, published only 
after his death, is the one referred to. 
In the part of Analysis with which we are about to be concerned it is supposed, 
on the contrary, that the quantity passes through different states of magnitude, 
and one considers the changes which result from this in its functions. Quantities 
regarded as changing in magnitude, or as able to change, are called 
variables, and the name constants is given to those which always keep 
the same value in the course of the calculation . . . (Lacroix [1810], p. 140). 


One calls a variable quantity one which is considered as having to 
receive successively several values different from one another, 
... When the values successively attributed to the same variable 
approach a fixed value indefinitely, in such a way as to end by 
differing from it by as little as one wishes, this last is called the limit of 
all the others. (Cauchy [1821], p. 4. Similar definitions are repeated in 
Cauchy’s later works.) 


When a magnitude takes successively values which approach more and more 
closely that of a constant magnitude, in such a way that its 
difference from the latter can become and remain less than any 
designated magnitude, whether the variable is always below, or always 
above, or sometimes below and sometimes above the constant, one says 
that the first approaches the second indefinitely, and that the latter is 
its limit. 

Thus we call the limit of a variable the constant quantity which the 
variable approaches indefinitely without ever reaching it. (Duhamel [1860], p. 9.) 


One calls variable a quantity which takes successively different 
values, and constant one which keeps a fixed value in the course of one 
and the same calculation .... (Sturm [1863], p. x.) 


When the successive values of a variable quantity approach indefinitely 
a fixed and determinate quantity, in such a way as to differ from it by as 
little as one wishes, this fixed quantity is called the limit of the values of the 
variable. (Op. cit., p. 3.) 


When the values of a variable $x$ approach more and more closely the 
value of a constant $a$, in such a way that the absolute value of the difference 
$x - a$ can become and remain constantly less than any given quantity 
whatever, one says that the variable $x$ has the constant $a$ for its limit. 
(Serret [1868], p. 3.) 


One calls variable a quantity which has no determinate value, but 
which can receive a more or less extended succession of arbitrary values. 
(Houël [1878], p. 103.) 


Let $x$ be a real variable which passes successively through an infinity of values 
according to any law whatever, in such a way that, for each value taken by $x$, 
one can distinguish the values which precede from those which follow, and 
that no value of $x$ is the last. One says that $x$ tends to a determinate limit 
if the successive values of $x$ approach a determinate number $a$, 
in such a way that the difference $x - a$ ends by decreasing in 
absolute value below any given positive number $\varepsilon$, however small it may be. 
(De la Vallée-Poussin [1921], p. 11.) 


One must agree with Fisher that Cauchy's definition does not have the antiseptic 
clarity to which we are at present accustomed. Yet the basic idea of a 
variable as a kind of dynamic entity having successive values is quite clear in the 
theory and practice of Cauchy and his successors throughout the nineteenth 
century. This dynamic concept survives to this day in the way we motivate our 
definition of $\lim_{x \to a} f(x) = k$: $f(x)$ tends to $k$ as $x$ tends to $a$. There appears to be 
no need to assume that there is anything more to a variable beyond the succession 
of its values to get a satisfactory interpretation of the notion; indeed this 
interpretation emerges from the work of nineteenth century mathematicians 
themselves, as we shall show later. 
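
For comparison, the modern static definition and its reading in terms of successive values can be set side by side. The formulation is the standard textbook one, stated for a real function $f$ defined near $a$; it is not taken from any of the authors quoted above.

```latex
% Standard modern formulations, given for comparison only (not from the quoted authors).
\[
  \lim_{x \to a} f(x) = k
  \iff
  \forall \varepsilon > 0 \;\exists \delta > 0 \;\forall x \,
  \bigl( 0 < |x - a| < \delta \implies |f(x) - k| < \varepsilon \bigr),
\]
\[
  \lim_{x \to a} f(x) = k
  \iff
  f(x_n) \to k \text{ for every sequence } (x_n) \text{ with } x_n \neq a \text{ and } x_n \to a .
\]
```

The second form keeps something of the dynamic picture: the variable $x$ is given by its successive values $x_n$.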

Fisher claimed that Cauchy allows his variables to take on infinitesimal 
values. This, I believe, is a misreading of the entire French mathematical 
tradition. The reason that variables cannot take on infinitesimal values is that 
the successive values of a variable are real numbers (see 6, below), and infinitely 
small quantities are defined in terms of variables. 


3 INFINITELY SMALL QUANTITIES 


The book by Lacroix [1810] has a long preface, part of which discusses the nature 
of the infinite in mathematics. The text introduces infinitely small and infinitely 
large quantities by example. Lacroix considered the ratio of two polynomials 


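$\frac{Ax^{\alpha} + Bx^{\beta} + Cx^{\gamma} + \ldots}{A'x^{\alpha'} + B'x^{\beta'} + C'x^{\gamma'} + \ldots}$ 


where $\alpha < \beta < \gamma \ldots$, $\alpha' < \beta' < \gamma' \ldots$, and observed that as $x$ tends to zero 
this ratio becomes $Ax^{\alpha-\alpha'}/A'$ if $\alpha > \alpha'$, $A/A'$ if $\alpha = \alpha'$, and $A/(A'x^{\alpha'-\alpha})$ if $\alpha < \alpha'$. 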
He then explained that the infinitely small and the infinitely large are derived 
concepts which can be eliminated by analysis. 
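
Lacroix’s observation can be checked on a small case. The coefficients and exponents below are chosen only for illustration and are not Lacroix’s.

```latex
% Assumed example: A = 2, alpha = 1, B = 1, beta = 3; A' = 5, alpha' = 2, B' = 1, beta' = 4.
\[
  \frac{2x + x^{3}}{5x^{2} + x^{4}}
  = \frac{x\,(2 + x^{2})}{x^{2}\,(5 + x^{2})}
  = \frac{1}{x} \cdot \frac{2 + x^{2}}{5 + x^{2}},
  \qquad
  \frac{2 + x^{2}}{5 + x^{2}} \to \frac{2}{5} \quad (x \to 0).
\]
```

Here $\alpha < \alpha'$, and for small $x$ the ratio behaves as $A/(A'x^{\alpha'-\alpha}) = 2/(5x)$, which grows without bound as $x$ tends to zero.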


This circumstance is expressed in abbreviated fashion by using the 
terms infinite and infinitely small, and by laying down as a principle 
that every power of an infinite quantity disappears before that of a higher 
exponent, and that, on the contrary, every power of an infinitely small quantity 
vanishes in comparison with that of a smaller exponent... This language, very 
convenient for its brevity and exact in substance, has been the pretext for a great 
number of objections, because it seems to attribute an actual existence to 
the mathematical infinite, which is, properly speaking, only a negative idea, 
since one calls infinite or infinitely small a quantity which has reached the 
highest degree of magnitude or the last degree of smallness; and one does not 
then conceive how there can be different orders of infinites or 
of infinitely smalls, nor even that a quantity can ever be considered 
as actually infinite or infinitely small. (Op. cit., p. 18.) 


In several places Cauchy defines the infinitely small and infinitely large in 
terms of limits of variables. For example: 
One says that a variable quantity becomes infinitely small when its numerical 
value decreases indefinitely in such a way as to converge toward the limit 
zero. (Cauchy [1821], p. 26.) 


And 


When the successive numerical values of the same variable decrease 
indefinitely in such a way as to fall below any given 
number, this variable becomes what one calls an infinitely small [infiniment petit] or an 
infinitely small quantity. A variable of this kind has zero for its limit. 
(Ibid., p. 14.) 


Similar definitions were repeated by numerous later mathematicians: 


One calls infinitely small quantity, or simply infinitely small, any 
variable magnitude whose limit is zero. (Duhamel [1860], p. 9.) 


When a variable quantity takes smaller and smaller values, in such a 
way that it can become less than any given quantity, one says 
that it becomes infinitely small . . . (Sturm [1863], p. 6.) 


One calls infinitely small, or infinitely small quantity, a number or 
a variable magnitude which diminishes indefinitely and approaches as closely as one 
wishes a zero limit, without ever reaching it. (Bertrand [1864], p. 1.) 


When a variable quantity tends to the limit zero, one says that it becomes 
infinitely small; one then calls it an infinitely small. 

When a variable increases indefinitely, in such a way that it can become and 
remain constantly greater than any given quantity whatever, one says 
that it becomes infinitely large, or simply infinite. (Serret [1868], p. 4.) 


A variable $x$ is called infinitely small (respectively, infinitely large) if it has been 
agreed to assign to it a sequence of values $x_1, x_2, x_3, \ldots$ such that their 
absolute value eventually becomes and remains smaller (respectively, larger) 
than any arbitrarily chosen constant. (Worpitzky [1880], p. 1.) 


In particular, a variable becomes ‘infinitely small’ if, in changing continuously, 
it is not prevented by any condition from becoming smaller in absolute 
value than any assignable number, i.e. if it has the 
limit zero. (Harnack [1881], p. 16.) 


When a variable quantity thus tends to zero or to $\infty$, one says that it 
is infinitely small or infinitely large. (Jordan [1882], p. 4.) 


We shall call infinitely small quantity, or infinitely small, any 
VARIABLE quantity having zero for its limit. 
We shall call infinite quantity any VARIABLE quantity which can 
be taken greater than any given quantity. (Laurent [1885], p. 8.) 


An infinitely small quantity, or an infinitely small, is a variable quantity 
which has zero for its limit. (Gilbert [1887], p. 41.) 


One says that $a_n$ is infinitely small, or infinitesimal, for infinite $n$ if it 
eventually becomes and remains smaller in absolute value than any 
arbitrarily small positive number. In other words: if to every positive 
number $\varepsilon$ there corresponds a number $\nu$ such that $|a_n| < \varepsilon$ always holds for $n > \nu$, 
then one says that the number $a_n$ is infinitesimal, or that it converges to zero. 
(Cesaro [1904], p. 89.) 


One says that a variable number $x$ has a fixed number $a$ for its limit, or tends 
to $a$, when the absolute value of the difference $x - a$ ends by becoming and 
remaining smaller than any positive number given in advance. When $a = 0$, 
the number $x$ is said to be an infinitely small. (Goursat [1917], p. 1.) 


A variable which has zero for its limit is an infinitely small quantity, or 
an infinitely small. (De la Vallée-Poussin [1921], p. 17.) 


Many of these mathematicians follow Lacroix in asserting the auxiliary status 
of infinitely small quantities and stressing their essential variability. 


An infinitely small quantity, or an infinitely small, is therefore not a 
determinate quantity having an assignable actual value: it is, on the 
contrary, an essentially variable quantity which has zero for its limit. 
(Sturm [1863], p. 6.) 


These expressions infinitely small and infinitely large therefore have no other 
purpose than the abbreviation of language. (Serret [1869], p. 4.) 


... an infinitely small quantity, being essentially variable, has no 
fixed value, and consequently its magnitude is in no way tied to our physical 
estimates. The essence of an infinitely small is not to be imperceptible 
but to be able to decrease as much as one wishes. (Houël [1878], p. 106.) 


Thus, in calling a number infinitely large or small, one says 
nothing whatever about its momentary value, but only 
characterises the manner in which its value changes. (Worpitzky 
[1880], p. x.) 


We have underlined the word variable deliberately. There is no fixed quantity 
that is infinite or infinitely small; zero is not infinitely small, because zero 
is fixed. (Laurent [1885], p. 8.) 


But one must not let oneself be misled by these names. There is, 
properly speaking, no infinitely small, a quantity smaller than any 
given quantity being evidently zero. As for the infinite, it escapes 
all measure and cannot enter into a calculation. (Jordan [1882], p. 4.) 


Just as one applies the name infinitely large, or infinite, to a number 
which tends to exceed every bound in absolute value, 
so every variable number which has the limit zero is called 
an infinitely small or an infinitesimal. The concept of the infinitesimal, 
like that of the infinite, thus implies above all the presupposition of 
variability. [p. 487:] For a clear understanding of the infinitesimal calculus 
it is absolutely necessary never to lose sight of the fact that infinitesimal 
magnitudes, like infinite ones, are essentially variable, ... (Cesaro 
[1904], p. 486.) 


4 INTERPRETATION OF INFINITELY SMALL QUANTITIES 


It seems clear that in the praxis founded by Cauchy, to give a variable $\alpha$ is to 
present its successive values $\alpha_0, \alpha_1, \alpha_2, \ldots$, i.e. to prescribe a specific mode of 
variation. To say ‘Let $\alpha$ be an infinitely small quantity...’ is precisely to say: 
‘Let $\alpha = \{\alpha_n\}_n$ be a sequence of reals such that $\lim \alpha_n = 0$, ...’. A more 
general concept of variable was given by Méray in his [1894]. He was concerned 
with constructing real numbers from rationals by means of generalised Cauchy 
sequences. At an early stage in the theory he introduced infinitely small 
quantities and explicitly defined some of the concepts for working with sequences 
which were undefined but implicit in previous authors—such as a sequence 
finishing with a certain property. His definitions are worth quoting because they 
are closely connected with Robinson’s interpretation of Cauchy’s variables and 
because they are a starting point for the construction of non-standard analysis 
(a matter which will not be pursued further here). 
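
A minimal example of the sequence reading, and of a sequence ‘finishing with a certain property’, is the following; the particular sequence is our choice and is not drawn from Cauchy or Méray.

```latex
% Assumed example: the null sequence alpha_n = 1/n, n = 1, 2, 3, ...
\[
  \alpha = \{\alpha_n\}_{n \ge 1}, \qquad \alpha_n = \frac{1}{n}, \qquad
  \lim_{n \to \infty} \alpha_n = 0,
\]
\[
  |\alpha_n| < \varepsilon \quad \text{for all } n > \frac{1}{\varepsilon}
  \qquad (\varepsilon > 0).
\]
```

To say that this $\alpha$ is infinitely small is then to say that, for each positive $\varepsilon$, it finishes by having numerical value less than $\varepsilon$.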


In infinitesimal Analysis one is obliged, at almost every moment, to 
consider quantities varying in such a way as to take successively 
determinate numerical values, unlimited in number. We shall call them 
variants [variantes]. 

Each value of a variant usually depends on the corresponding values 
of one or more positive integers which increase without cease from 
one of them to the next and, consequently, indefinitely: these are its indices. 
Sometimes these numbers are not explicitly specified; one then takes 
as indices suitable ordinal numbers attributed to the various 
values of the variant. (Méray [1894], p. 23.) 


And (ibid.): 

One says that a variant $v_{m,n,\ldots}$ enjoys such and such a determinate property from 
the values $\mu, \nu, \ldots$ of its indices onward, when the property in question 
belongs to it for all combinations of the values of the indices satisfying 
simultaneously the relations 


$m \geq \mu, \; n \geq \nu, \ldots$ 

Thus the variable quantities are (rational) numbers indexed by k-tuples of 
positive integers (for some k), i.e. are functions $a : (\mathbb{Z}^+)^k \to \mathbb{Q}$. Define an order 
relation $\geq$ on $(\mathbb{Z}^+)^k$ by: $(m_1, \ldots, m_k) \geq (m'_1, \ldots, m'_k)$ if $m_1 \geq m'_1, \ldots, m_k \geq m'_k$ 
for $m_1, \ldots, m'_k \in \mathbb{Z}^+$. Then a variable quantity $a_\beta$ indexed by $(\mathbb{Z}^+)^k$ finishes by 
having the property $P$ if there exists an index $\alpha$ such that $P(a_\beta)$ for all indices 
$\beta \geq \alpha$. 
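
To illustrate the definition with $k = 2$, take the doubly indexed quantity below; the example is an assumption of ours and does not appear in Méray.

```latex
% Assumed example: a_{m,n} = 1/(m+n), indexed by pairs (m,n) of positive integers.
\[
  a_{m,n} = \frac{1}{m+n}, \qquad (m,n) \in (\mathbb{Z}^+)^2 .
\]
% Given eps > 0, choose any positive integer mu with mu >= 1/eps and take the index (mu, mu).
\[
  (m,n) \ge (\mu,\mu)
  \;\Longrightarrow\;
  m + n \ge 2\mu \ge \frac{2}{\varepsilon}
  \;\Longrightarrow\;
  |a_{m,n}| \le \frac{\varepsilon}{2} < \varepsilon .
\]
```

So, for every positive $\varepsilon$, this $a$ finishes by having the property $|a_{m,n}| < \varepsilon$, the index $(\mu,\mu)$ serving as the required $\alpha$.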


If, however small the positive quantity $\varepsilon$ may be taken, one always 
ends by having numerically $v_{m,n,\ldots} < \varepsilon$, the variant $v_{m,n,\ldots}$ is called 
an infinitely small quantity. (Op. cit., p. 24.) 

This is an extension of Cauchy’s notion of ‘infinitely small’. 


A given function $f(u_{m,n,\ldots}, v_{p,q,\ldots}, \ldots)$ of variants, having indices in common 
or not, is evidently a new variant having for its 
indices those of the integers $m, n, \ldots; p, q, \ldots; \ldots$ which are distinct from one 
another. (Ibid.) 


Thus functions of variants are computed ‘component-wise’. This is implicit 
in all of the authors quoted above. 


On dit que la variante $v_{m,n,\ldots}$ a pour limite ou tend vers la quantité (invariable)
V, quand la différence

$V - v_{m,n,\ldots}$

est infiniment petite; ... En particulier, une variante infiniment petite tend
vers zéro. (Ibid., p. 30.)


Méray was not, of course, attempting to interpret Cauchy: he was refining 
and working with concepts inherited from the Cauchy tradition. More than 
sixty years after Méray’s work Robinson ([1966], p. 270) suggested that Cauchy’s 
variable was ‘a function whose range is numerical while its domain may be any 
ordered set without last element’. Thus Robinson extended Méray’s definition 
of ‘variante’ by replacing the sets $(\mathbb{Z}^+)^k$ ordered by $\geq$ with an arbitrary ordered
set without last element: Robinson was probably unaware of Méray’s work. 


5 ORDERS OF INFINITELY SMALL QUANTITIES


Cauchy developed an algebra of infinitely small quantities based on the notion 
of (relative) order: 


Soit $\alpha$ une quantité infiniment petite, c’est-à-dire, une variable dont la
valeur numérique décroisse indéfiniment. ... En général, on appelle
infiniment petit du premier ordre toute quantité variable dont le rapport avec
$\alpha$ converge, tandis que la valeur numérique de $\alpha$ diminue, vers une limite
différente de zéro; infiniment petit du second ordre toute quantité
variable avec $\alpha$, et dont le rapport avec $\alpha^2$ converge vers une limite finie
différente de zéro, &c... ([1821], p. 27.)


Similar definitions were given by numerous mathematicians after Cauchy; we 
quote only two of them. 


Lorsque l’on considère simultanément plusieurs infiniment petits, on
choisit arbitrairement l’un d’entre eux comme infiniment petit principal:
cela fait, on adopte les définitions suivantes. ... On nomme en général
infiniment petit de l’ordre n un infiniment petit dont le rapport à la puissance
n de l’infiniment petit principal a une limite finie. (Bertrand [1864], p. 1.)

Zwei infinitesimale Größen heißen von derselben Ordnung wenn ihr
Verhältnis nach einem von Null verschiedenen Grenzwert konvergiert.
Wenn dagegen das Verhältnis einer infinitesimalen Größe $\beta$ zu einer
andern $\alpha$ nach Null konvergiert, ..., so sagt man, $\beta$ sei von höherer
Ordnung als $\alpha$. (Cesaro [1904], p. 487.)


The doubts cast by Fisher [1979] on Cauchy’s results on orders of the infinitely
small are dispelled if one bears in mind that the principal infinitely small
quantity (the given $\alpha$ in Cauchy’s statement) is a variable quantity converging
to zero, i.e. $\lim_{n \to \infty} \alpha_n = 0$. Cauchy states that the general form of an infinitely
small $\beta$ of order n (with respect to $\alpha$) is $k\alpha^n(1 \pm \varepsilon)$, where $k \neq 0$ and
$\varepsilon$ is a ‘variable number which decreases indefinitely with the numerical value
of $\alpha$’ (Cauchy [1821], p. 28), but offers no proof. Fisher correctly saw that this
statement is slightly erroneous: the general form should be $k\alpha^n(1 + \varepsilon)$, where $\varepsilon$
is infinitely small. But he argued that the n-th order infinitely small is $f(\alpha)$,
where $\lim f(\alpha)/\alpha^n = k \neq 0$, so that $\beta = k\alpha^n(1 + \varepsilon)$ where $\varepsilon = \beta/k\alpha^n - 1$; $\varepsilon$ is
then infinitely small because $\lim_{\alpha \to 0} \varepsilon = 0$. The correct statement is given in
Serret [1868], p. 5, from which it is clear that the infinitely smalls must be
interpreted as null sequences:
Soient $\alpha$ l’infiniment petit principal, $\beta$ un deuxième infiniment petit: on
aura, par la nature de ces quantités, $\lim \alpha = 0$, $\lim \beta = 0$. Cela posé, si le
rapport $\beta/\alpha$ tend vers une limite k différente de zéro, de manière que l’on
ait

$\beta/\alpha = k + \varepsilon,$

$\varepsilon$ étant un infiniment petit, nous dirons que $\beta$ est un infiniment petit du
premier ordre. La formule précédente donne

$\beta = \alpha(k + \varepsilon),$

et elle fournit ainsi l’expression générale des infiniment petits du premier
ordre.


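Serret, like Cauchy, offers no proof, because it is too trivial. Thus, suppose
$\lim_{n \to \infty} \beta_n/\alpha_n = k$. Put $\varepsilon_n = \beta_n/\alpha_n - k$. Then $\lim_{n \to \infty} \varepsilon_n = 0$ and $\beta_n = \alpha_n(k + \varepsilon_n)$. Hence
$\beta = \alpha(k + \varepsilon)$ where $\varepsilon$ is infinitely small.

Cauchy’s theorem 1, p. 29 of his [1821] is also criticised by Fisher. This
theorem states: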

Si l’on compare l’un à l’autre deux infiniment petits d’ordres différens, 
pendant que tous les deux convergeront vers la limite zéro, celui qui est 
de l’ordre le plus élevé finira par obtenir constamment la plus petite valeur 
numérique. 


Again, the mysteries created around this theorem disappear when one makes
explicit the variable, sequential nature of the infinitely small. Cauchy’s proof is
quite correct but is on the verge of triviality. Let $\beta = k\alpha^n(1+\varepsilon)$, $\beta' = k'\alpha^{n'}(1+\varepsilon')$
be infinitely small quantities of orders $n$, $n'$ respectively, with $n' > n$. Then

$$\frac{\beta'}{\beta} = \frac{k'}{k}\,\alpha^{n'-n}\,\frac{(1+\varepsilon')}{(1+\varepsilon)}.$$

Obviously, $\lim \beta'/\beta = 0$. This concludes Cauchy’s proof. The trivial details
of the argument are as follows. Since $n' > n$ and $\lim_{r \to \infty} \alpha_r = 0$ we have
$\lim_{r \to \infty} \alpha_r^{n'-n} = 0$. Hence

$$\lim_{r \to \infty} \frac{\beta'_r}{\beta_r} = \frac{k'}{k}\,\lim_{r \to \infty} \alpha_r^{n'-n}\,\frac{(1+\varepsilon'_r)}{(1+\varepsilon_r)} = 0.$$

But then $|\beta'_r| < |\beta_r|$ for all sufficiently large r, i.e. $\beta'$ finishes by being less in
magnitude than $\beta$.
The remaining theorems of the first section of the second chapter of Cours 
d'Analyse are equally trivial—and correct—when infinitely small quantities 
are understood as null sequences. 
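
The sequential reading of the theorem can be checked on a concrete pair of null sequences. The following is only a sketch under stated assumptions: the principal infinitesimal $\alpha_r = 1/r$, the constants 2 and 50, the orders 2 and 3, and the function names are all hypothetical choices made for illustration, and a finite range can only illustrate, not prove, what happens ‘for all sufficiently large r’.

```python
from fractions import Fraction

# Hypothetical principal infinitesimal: the null sequence alpha_r = 1/r.
def alpha(r):
    return Fraction(1, r)

# beta has order 2: beta = k * alpha^2 * (1 + eps) with k = 2, eps = alpha.
def beta(r):
    a = alpha(r)
    return 2 * a**2 * (1 + a)

# beta' has order 3: beta' = k' * alpha^3 * (1 + eps') with k' = 50, eps' = -alpha.
def beta_prime(r):
    a = alpha(r)
    return 50 * a**3 * (1 - a)

# Indices in a finite range at which |beta'_r| < |beta_r| does NOT hold.
failures = [r for r in range(1, 1001) if not abs(beta_prime(r)) < abs(beta(r))]
last_failure = max(failures)
```

With these choices the inequality fails exactly for r = 2, ..., 22 and holds from r = 23 onward within the range examined: the higher-order quantity starts out larger (because k' is large) but ‘finishes by’ being the smaller, which is all the theorem asserts.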







6 CONTINUITY 


Definitions of continuity in terms of infinitely small quantities were common 
in the Cauchy tradition, e.g.: 


... la fonction f(x) restera continue par rapport à x entre les limites 
données, si, entre ces limites, un accroissement infiniment petit de la 
variable produit toujours un accroissement infiniment petit de la fonction 
elle-même. (Cauchy [1821], p. 34.)


Nous disons, comme l’on sait, qu’une fonction est continue, entre deux
limites données d’une variable dont elle dépend, ou dans le voisinage d’une
valeur particulière attribuée à cette variable, lorsque entre ces limites, ou
dans le voisinage de cette valeur particulière, la fonction conservant sans
cesse une valeur unique et finie, varie de telle sorte qu’un accroissement
infiniment petit attribué à la variable, produise toujours un accroissement
infiniment petit de la fonction elle-même. (Cauchy [1844], p. 17.)


Une fonction f(x) de la variable x est dite continue pour les valeurs de x
comprises entre deux limites $x_0$ et X, si, pour toutes les valeurs de x, la valeur
absolue de la différence

$f(x+h) - f(x)$

décroît indéfiniment avec h, ou est infiniment petite en même temps que h.
(Serret [1868], p. 15.)


Une fonction f(x, y, ...) de plusieurs variables x, y, ... est dite continue
dans le voisinage d’un système de valeurs

$x = a, \quad y = b, \ldots,$

de ces variables, si l’accroissement $f(a+\delta, b+\varepsilon, \ldots) - f(a, b, \ldots)$ de la
fonction, correspondant aux accroissements $\delta, \varepsilon, \ldots$ donnés aux variables
à partir des valeurs a, b, ..., est infiniment petit toutes les fois que $\delta, \varepsilon, \ldots$
sont tous infiniment petits. (Houël [1878], p. 121.)

... on peut dire qu’une fonction continue est une fonction qui prend
toujours un accroissement infiniment petit quand on donne à sa variable
un accroissement infiniment petit quelconque. (Laurent [1885], p. 0.)


It will be observed that in many applications of continuity Cauchy uses a
‘sequential’ form of the definition, i.e. f is said to be continuous at a if
$\lim_{n \to \infty} x_n = a$ implies $\lim_{n \to \infty} f(x_n) = f(a)$. This occurs, for instance, in his proof of the
intermediate value theorem ([1821], pp. 460-2), an argument repeated by Houël
[1878]. The transition from the infinitesimal form of definition to the sequential
form might appear mysterious, but it is trivial when one recalls that infinitely
small quantities are null sequences. Thus, let $\alpha_n = x_n - a$. Then $\alpha = \{\alpha_n\}$
is infinitely small as $\lim x_n = a$. As f is continuous at a, $f(a+\alpha) - f(a)$ is
infinitely small by Cauchy’s definition so that $\lim_{n \to \infty} (f(x_n) - f(a)) = 0$. Hence
$\lim_{n \to \infty} f(x_n) = f(a)$. The sequential form of the definition of continuity is therefore
merely a rewording of the infinitesimal form.


We note that Jordan in his [1882], p. 11, gave an $\varepsilon$-$\delta$ definition of continuity
and then said:


On exprime souvent cette idée d'une manière plus courte, mais moins 
précise, en disant qu'à tout système d’accroissements infiniment petits 
donnés aux variables correspond pour la fonction un accroissement 
infiniment petit. 


De la Vallée Poussin gave an $\varepsilon$-$\delta$ definition of continuity and proceeded to
explain it in terms of infinitesimals. Goursat [1917], p. 12 attributed the modern
definition of continuity to Cauchy and then gave an $\varepsilon$-$\delta$ definition. It appears,
then, that the $\varepsilon$-$\delta$ definition was seen as being equivalent to the infinitesimal
form. For functions of several variables this equivalence is perhaps not
completely obvious. The two definitions can be expressed as:


D1: f is C-continuous at the point (X, Y, ...) if for all infinitely small quantities
$\alpha, \beta, \ldots$, $f(X+\alpha, Y+\beta, \ldots) - f(X, Y, \ldots)$ is infinitely small.


D2: f is continuous at the point (X, Y, ...) if for all $\varepsilon > 0$, there exists a $\delta$ such
that $|f(X+x, Y+y, \ldots) - f(X, Y, \ldots)| < \varepsilon$ whenever $x^2 + y^2 + \ldots < \delta$.


It is easy to show that continuity implies C-continuity. To prove the converse,
suppose that f is not continuous at (X, Y, ...). By D2, there exists a positive
number $\varepsilon$ such that for all positive integers n there exist real numbers $\alpha_n, \beta_n, \ldots$
such that

(1) $\alpha_n^2 + \beta_n^2 + \ldots < 1/n^2$

and

(2) $|f(X+\alpha_n, Y+\beta_n, \ldots) - f(X, Y, \ldots)| \geq \varepsilon$.

Let $\alpha = \{\alpha_n\}$, $\beta = \{\beta_n\}, \ldots$ By (1), $\alpha, \beta, \ldots$ are infinitely small. By
(2), $f(X+\alpha_n, Y+\beta_n, \ldots) - f(X, Y, \ldots)$ does not converge to 0 as $n \to \infty$,
i.e. $f(X+\alpha, Y+\beta, \ldots) - f(X, Y, \ldots)$ is not infinitely small. Hence C-continuity
implies continuity.


Cauchy is distinguished from his successors by his distinctive concept of
‘neighbourhood’ implicit in some of his proofs. We have previously (Cleave
[1971]) examined an allegedly mistaken theorem of Cauchy’s ([1821], p. 131) in
which this concept was used. Another theorem of his, unencumbered by
problems of convergence, but using implicitly the same notion of
‘neighbourhood’, occurs in the same work:


Si les variables x, y, z... ont pour limites respectives les quantités fixes
et déterminées X, Y, Z, ..., et que la fonction f(x, y, z...) soit continue
par rapport à chacune des variables x, y, z... dans le voisinage du système
des valeurs particulières

$x = X, \quad y = Y, \quad z = Z, \ldots,$

f(x, y, z...) aura pour limite f(X, Y, Z...). (P. 39.)


This statement is false when ‘neighbourhood’ is understood in the modern
sense. Thus let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by:

$$f(x, y) = \begin{cases} 0 & \text{if } x = 0 \text{ and } y = 0, \\ \dfrac{xy}{x^2 + y^2} & \text{if } x \neq 0 \text{ or } y \neq 0. \end{cases}$$

At each point (x, y) f is a continuous function of each variable separately, but
f is not continuous at (0, 0) (def. D2).
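
The failure of joint continuity can be seen numerically. The sketch below assumes that the function intended in the text is the standard example $xy/(x^2+y^2)$ (with value 0 at the origin); the sequences $1/n$ are hypothetical choices of infinitely small increments, and the variable names are illustrative only.

```python
def f(x: float, y: float) -> float:
    """Assumed counterexample: 0 at the origin, xy / (x^2 + y^2) elsewhere."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

# Increments along either axis: f vanishes identically there, so f is
# continuous at (0, 0) as a function of each variable separately.
axis_values = [f(1.0 / n, 0.0) for n in range(1, 6)] + \
              [f(0.0, 1.0 / n) for n in range(1, 6)]

# Increments alpha_n = beta_n = 1/n along the diagonal: both are null
# sequences, yet f(alpha_n, beta_n) - f(0, 0) stays at 1/2.
diagonal_values = [f(1.0 / n, 1.0 / n) for n in range(1, 6)]
```

Every entry of `axis_values` is 0 while every entry of `diagonal_values` is 1/2, so the increment of f along the diagonal is not infinitely small: f is not C-continuous at the origin in the sense of D1, and hence not continuous in the sense of D2.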

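Cauchy’s proof of the theorem consists of showing that for all infinitely small
quantities $\alpha, \beta, \gamma \ldots$, $f(X+\alpha, Y+\beta, Z+\gamma \ldots) - f(X, Y, Z \ldots)$ is itself
infinitely small. It proceeds as follows:


$f(X+\alpha, Y+\beta, Z+\gamma \ldots) - f(X, Y, Z \ldots)$
$= \{f(X+\alpha, Y+\beta, Z+\gamma \ldots) - f(X, Y+\beta, Z+\gamma \ldots)\}$
$+ \{f(X, Y+\beta, Z+\gamma \ldots) - f(X, Y, Z+\gamma \ldots)\}$
$+ \{f(X, Y, Z+\gamma \ldots) - f(X, Y, Z \ldots)\}$
$+ \ldots$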


Since f is a continuous function of the first variable in the neighbourhood of 
(X, Y, Z...) it follows from the definition of continuity of functions of one 
variable that the first term {. . .} is infinitely small. Similarly as f is a continuous 
function of the second variable in the neighbourhood of (X, Y, Z...) it follows 
that the second term {...} is infinitely small, etc. Thus all of the terms {...} 
are infinitely small and so $f(X+\alpha, Y+\beta, Z+\gamma \ldots) - f(X, Y, Z \ldots)$ is also
infinitely small. That completes Cauchy’s proof. Evidently Cauchy counted
the points $(X, Y+\beta, Z+\gamma \ldots)$, $(X, Y, Z+\gamma \ldots)$ etc. as being in the
neighbourhood of (X, Y, Z...). Thus Cauchy assumed that the neighbourhood of a
point contained all points infinitesimally close to it. Given this definition, there 
is nothing wrong with Cauchy’s theorem. 


7 THE EXTENDED CONTINUUM 


We have seen that infinitely small quantities enter into nineteenth century 
mathematics in the algebra of infinitesimals developed in the study of orders 
of the infinitely small, and in the Cauchy conception of ‘neighbourhood’. 

Until late in the nineteenth century mathematicians did not give definitions
of real numbers. Many began expositions of analysis with discussions of the 
notion of quantity and magnitude and followed with definitions of variable 
quantity (e.g. Cauchy, Houël). Cauchy’s [1821] gives some preliminary remarks
on quantities but concludes with long notes (pp. 403-60) which show that his 
positive and negative quantities form a real closed field, but an Archimedean 
axiom is not stated. Fisher [1979] remarks that Cauchy’s definition of variable 
quantity does not exclude infinitesimal values. However, it is clear from Cauchy’s 
actual usage, that he takes quantities to be reals. For instance, he proves (pp. 
104-6) that if g is a continuous function which satisfies 


(3) $g(x+y) = g(x) + g(y)$


then there exists a constant a such that g(x) = ax for all x. The essentials of 
Cauchy’s argument are firstly that (3) implies that g(q) = qg(1) for all rationals q,
and secondly, that any quantity x is the limit of some sequence of rationals so 
that as g is continuous, g(x) = xg(1). It can be safely concluded that for 
Cauchy, and his followers, real quantity meant real number as we understand 
the term. 

Throughout the Cauchy tradition the notion of ‘infinitely small’ is mediated 
by the concept of ‘variable quantity’. Infinitesimals therefore subsist rather than 
exist. In this sense, one could claim that nineteenth century mathematicians, 
Cauchy included, did not use ‘actual’ infinitesimals, even though they developed 
an algebra of infinitesimals i.e. considered them ‘directly’. Yet there is this 
difference between Cauchy and his followers: Cauchy’s neighbourhoods are 
populated with subsistent entities infinitely close to the reals, the quantities that 
can be given and measured. The source of Cauchy’s ‘errors’ is exactly here: not
in the liberal use of infinitely small quantities, which by his definitions are quite 
legitimate, but in his concept of ‘neighbourhood’, which is neither correct nor 
incorrect, but registers a decision to include infinitesimals in his continuum. 


JOHN P. CLEAVE 
University of Bristol 


Review Article


SUPPORT AND SURPRISE: L. J. COHEN’S VIEW OF 
INDUCTIVE PROBABILITY* 


In The Probable and the Provable, L. J. Cohen introduces a measure of what he 
calls ‘inductive probability’. His measure has the formal properties of measures 
of degrees of belief or confidence of acceptance derivable from G. L. S. Shackle’s 
measures of potential surprise. Shackle thought of his measures of potential 
surprise (or degrees of disbelief) and the correlated measures of degrees of belief 
as rivals to measures of uncertainty obeying the classical calculus of probabilities.1 
I have argued elsewhere that Shackle’s measures should not be construed as 
rival measures of uncertainty to probabilistic measures but as measures con- 
tributing to the explication of what Keynes called ‘weight of argument’ and 
Cohen has followed me in this.2

This pleasant agreement with my proposed reconstruction of Shackle’s idea 
is enhanced by Cohen’s contribution to the discussion. Cohen contends that 
inductive probability (in his sense) has properties which render it useful in 
clarifying certain kinds of reasoning in Anglo-American law. I hope others will 
explore this important suggestion of Cohen’s more closely. 

I wish, however, to focus on some important differences between Cohen’s 
approach to applying Shackle-like measures to inductive reasoning and my own. 
Although Cohen recognises quite clearly that measures conforming to the 
calculus of probabilities and Shackle-like measures have complementary and 
not rival applications, he is quite anxious to establish his right to call Shackle-like 
measures a species of probability (which he calls ‘inductive probability’). He 
is also concerned to claim that measures of inductive probability are parasitic 
on measures of inductive support in the sense in which he introduced such 
measures in his earlier book, The Implications of Induction. These two objectives 
contribute to his restricting the domain of applicability of his measures of 
inductive probability to contexts of singular predictive inference. 
My alternative reconstruction of Shackle-like measures applies to the contexts
considered by Cohen and to other contexts as well. Moreover, it does so without 
appeal to Cohen’s concept of inductive support. Thus, the restrictions Cohen 
imposes on the domain of applicability of Shackle-like measures of degrees of 
confidence of acceptance (as I have called them) are needless unless one is 
concerned, as Cohen seems to be, with enhancing the importance of his con- 
ception of inductive support and with defending the right to call Shackle-like 
measures degrees of probability. I find Cohen’s efforts of doubtful validity in 
the first place and of little importance in the second. Thus, it seems to me that 
Cohen has gratuitously restricted the domain of applicability of his measures 
of inductive probability (or whatever one wishes to call them) and has thereby
diminished the value of his contribution.

* Review of COHEN, L. J. [1977]: The Probable and the Provable. Oxford University
Press. £9.50. Pp. ix+363.
1 Shackle [1949], chapter VII and [1961], part II.
2 Levi [1966], [1967], chapters VIII and IX, and [1972].

In his Expectation in Economics, Shackle introduced the concept of potential 
surprise. Letting $d_K(g)$ be the potential surprise assigned g relative to K, the
salient conditions satisfied by this measure are the following:


(a) If $d_K(g) > 0$, $d_K(\neg g) = 0$ = minimum d-value.

(b) If $K \vdash \neg g$, $d_K(g) = 1$ = maximum d-value.

(c) $d_K(g \vee f) = \min(d_K(g), d_K(f))$.

The maximum and minimum values could have been chosen otherwise.
They are, however, convenient. Observe that it is permitted that g and $\neg g$
both bear minimum d-value.

Shackle also called degrees of potential surprise degrees of disbelief. I have 
called them degrees of confidence of rejection. According to Shackle, to believe 
that h to some positive degree is to disbelieve its contradictory to that degree. 


$b_K(g) = d_K(\neg g)$.1 Given this condition, b-functions should satisfy the following
requirements:
(a’) If $b_K(g) > 0$, $b_K(\neg g) = 0$ = minimum b-value.
(b’) If $K \vdash g$, $b_K(g) = 1$ = maximum b-value.
(c’) $b_K(g \,\&\, f) = \min(b_K(g), b_K(f))$.
Notice that it is permitted that g and $\neg g$ bear minimum b-value (i.e., 0
b-value).
I have called measures satisfying these requirements measures of ‘degrees
of confidence of acceptance’. Cohen’s measures of inductive probability satisfy 
these conditions. 
Shackle took his measure of potential surprise to be a rival to probability 
measures (in a sense in which such measures obey the calculus of probabilities) 
as a determinant of components of expectations to be used in evaluating rival 
options. The agent assigns to each hypothesis concerning the outcome of a 
feasible option a degree of potential surprise and a value representing his 
preferences. These two factors determine an expectation value for that hypo- 
thesis. The value of the option is a function of the best and the worst expectation 
values assigned that option. In effect, Shackle proposed a generalised criterion 
of the sort called the optimism-pessimism criterion and credited to L. Hurwicz 
by R. D. Luce and H. Raiffa in Games and Decisions. Moreover, he did so prior 
to Hurwicz and in a more general way.2
This is not the place to explore the defects in Shackle’s decision theory. In 
my opinion, it is of doubtful merit. Nonetheless, it has also seemed to me that 
Shackle’s conception of potential surprise is of some philosophical interest. It 
captures a notion of degree of disbelief (or a cognate notion of degree of dis- 
confirmation) which is non-probabilistic and which conforms to some pre- 
systematic precedents. Moreover, these presystematic precedents are not idle 
relics of a discredited ideology but manifestations of a concept which has a
useful function in inquiry as an index of a factor contributing to what Keynes
called ‘weight of argument’. Thus, I have contended for some time that Shackle’s
measure of potential surprise is not a rival to measures of uncertainty
conforming to the calculus of probabilities but performs a function complementary
to probability in deliberation and inquiry. Precisely the same observation applies
to b-functions representing degrees of confidence of acceptance.

1 Shackle [1949], p. 10. Shackle failed to state postulates for degrees of belief. They are
easily read off his axioms for potential surprise. To my knowledge, I was the first to do
this elementary exercise in print.
2 To my knowledge, the point is made in print for the first time in Ozga [1965] although
it is subsequently acknowledged to be correct in Arrow and Hurwicz [1972].

Suppose agent X begins with an initial stock of background knowledge and 
data K (representable by a deductively closed set of sentences or propositions). 
He is concerned to expand this corpus K by adding new information gratifying 
the demands occasioned by some question or problem. Let U be a set (for the 
sake of simplicity, I assume it is finite) of hypotheses $h_1, h_2, \ldots, h_n$ exclusive
and exhaustive relative to K and each consistent with K. U is such that a potential
answer to the question under investigation, in so far as X has identified such
answers, is representable as a case of adding a hypothesis g equivalent given K
to a disjunction of some subset of the elements of U and forming the deductive
closure $K_g$. There are $2^n$ potential answers including the two degenerate cases
where g is the disjunction of all elements of U and $K_g = K$ and the case where
g is the contradictory hypothesis (represented by the ‘null’ disjunction). Relative 
to K, the agent is constrained to endorse one of the potential answers. 

In attempting to evaluate which of rival potential answers generated by U 
to adopt relative to K, appeal is often made to some notion of support or
confirmation relative to K. How such appeal is made or should be made varies.
Three distinct sorts of appeal are worth mentioning here: 

(i) Sometimes risk of error is alleged to vary inversely to degrees of support
and evaluations of risk of error are taken to be relevant to determining which 
potential answer to adopt. In my opinion, measures of credal probability con- 
forming to the requirements of the calculus of probabilities qualify as measures 
of degree of belief expressing judgments of risk of error and contributing to 
evaluations of expected utility. To be sure, credal states are not always represent- 
able by unique probability measures; but for present purposes, this qualification 
will be overlooked. Probability measures can be used as measures of degrees of 
support or confirmation when these measures are understood as determining 
the credal states agents should adopt relative to their knowledge K. In this 
sense, confirmation contributes to determining how appraisals of risk of error 
should be made.

(ii) Sometimes the notion of support or confirmation is understood in a sense
in which it measures a magnitude to be maximised in choosing between rival 
potential answers. We are urged to choose the potential answer with the greatest 
support or confirmation among all alternatives. 

Probability measures obeying the calculus of probabilities are ill suited for 
this purpose. Popper pointed out a long time ago that maximising probability 
promotes vacuous answers. He was right. If g is the disjunction of the elements 
of U, it represents the potential answer bearing maximum probability and should 
be chosen if probability is to be maximised. 

This does not show that probability is not useful as a measure of risk of error 
and as a determinant of expected value in decision making but only that it fails 
as a measure of support in sense (ii).

Popper proposed his own measure or family of measures of corroboration
which he thought indicate what is to be maximised in choosing between rival 
answers to a given question. However, Popper recommended his measures for 
the purpose of determining which hypothesis is worthy of being tested further. 
I have proposed alternative measures of expected epistemic utility as factors 
to be maximised in choosing among rival potential answers for the purpose of 
adding information to the stock of background knowledge relative to which other 
hypotheses are to be tested. 

(iii) Sometimes a notion of support is introduced such that a hypothesis is
‘accepted’ if its support is high enough. Many authors think that probabilistic
measures qualify as measures of support in this sense as well as in sense (i).
However, as is well known, the set of sentences added in this manner to the
initial corpus K fails to yield a deductively closed set. Hence, ‘acceptance rules’
of this sort cannot be used for the purpose of forming a new corpus of knowledge 
to be used as background in subsequent inquiry. 

On the other hand, b-functions satisfying conditions (a’), (b’) and (c’) can be 
used to represent degrees of warranted belief or support having the property 
that if b-values are high enough the hypotheses should be added to the evidence. 

Let $b_K(g)$ be higher than some threshold value. $d_K(\neg g) = b_K(g)$ and,
according to condition (c), $d_K(\neg g)$ is equal to the smallest d-value assigned a
disjunct of $\neg g$ when $\neg g$ is represented as a disjunction of elements of a subset
of U. Consequently, if hypotheses are rejected if their d-values are greater than
the specified threshold, $\neg g$ is rejected as is every disjunction of a subset of
disjuncts of $\neg g$. But this is tantamount to holding that g is accepted as is every
deductive consequence of K and g. Moreover, g and every such consequence
has a b-value greater than the threshold.

Thus, the problems about deductive closure which plague probabilistic 
measures when used as measures of support in sense (iii) do not trouble
b-measures.

When b-measures are used for the purpose associated with (iii), a hypothesis
which has a high enough b-value is added to the evidence and inquiry as to its 
truth value is terminated. Similarly when the d-value is high enough, a hypo- 
thesis is rejected and its negation is added to the evidence. Once more, inquiry 
is terminated. The evidence or background knowledge K prior to termination 
is sufficient to warrant the termination. The ‘weight’ of the evidence is enough 
to render a decisive verdict concerning the hypothesis in question. It is in this 
sense, that b-functions (or d-functions) can be employed to evaluate weight of 
argument in the sense of Keynes. 

In The Probable and the Provable, Cohen introduces a measure $b_K(g;e)$ of
‘inductive probability’. When e is consistent with K, $b_K(g;e) = b_{K_e}(g)$ where
the left hand side of this equation is a b-function satisfying conditions (a’), (b’)
and (c’). Cohen explicitly recognises the use of his measures in type (iii) criteria
for acceptance and the relevance of his measures to the Keynesian notion of 
weight of argument. 

These are the main points of agreement between Cohen’s view and my 
own. 

By a deductively cogent inductive acceptance rule, I mean a rule which prescribes 
how one should expand a corpus K by adding items equivalent given K to 
disjunctions of subsets of elements of U and which secures that the result is
consistent and deductively closed. 

I have just shown in passing how by specifying a threshold value k for the
b-function, one can generate such a deductively cogent acceptance rule. One
way to do this is through appeal to d-values. The idea is that all elements of U
whose d-values are greater than k are rejected and all others unrejected. Let g
be equivalent given K to the disjunction of all unrejected elements of U. The
deductively cogent acceptance rule is to accept g and all deductive consequences
of K and g and thereby form the corpus $K_g$.

Observe that for each value of k from the minimum value of o to the maximum 
value (which we have set at 1), we can construct a deductively cogent acceptance 
rule. When k = 0, all elements of U with positive d-values are rejected. When
k = 1, no element of U is rejected. For some intermediate value of k, all and
only elements of U for which $d_K(g) > k$ are rejected. Thus, given a method for
constructing d-functions for diverse potential corpora of knowledge, we may 
construct a caution dependent family of deductively cogent acceptance rules 
characterised by a caution parameter k. 
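
The threshold rule just described is simple enough to state as a short program. This is a minimal sketch, assuming hypothetical d-values for three alternatives and representing a potential answer as the set of alternatives it leaves open; the function names `unrejected` and `accepted` are illustrative and are not drawn from Shackle, Cohen or the author’s earlier work.

```python
def unrejected(d_atom, k):
    """Alternatives not rejected at caution level k: an element of U is
    rejected if and only if its d-value is greater than k."""
    return frozenset(h for h, value in d_atom.items() if value <= k)

def accepted(f, d_atom, k):
    """A potential answer f (a set of alternatives read as a disjunction)
    is accepted iff it is entailed by the disjunction of the unrejected
    alternatives, i.e. iff it contains every unrejected alternative."""
    return unrejected(d_atom, k) <= frozenset(f)

# Hypothetical d-values relative to some corpus K.
d_atom = {"h1": 0.0, "h2": 0.4, "h3": 0.7}

strongest_at_0 = unrejected(d_atom, 0.0)    # boldest rule: k = 0
strongest_at_half = unrejected(d_atom, 0.5)
strongest_at_1 = unrejected(d_atom, 1.0)    # most cautious rule: k = 1
```

At k = 0 only `h1` survives; at k = 0.5 the strongest accepted answer is the disjunction of `h1` and `h2`; at k = 1 nothing is rejected. Because the accepted answers are exactly the supersets of a single nonempty set, the accepted set is consistent and closed under conjunction and consequence, which is what deductive cogency requires in this finite setting.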

Observe, however, that caution dependent families of deductively cogent 
acceptance rules can be constructed in other ways as well. Two examples will 
serve to illustrate the point: 

$R_1$: If U consists of a set of simple statistical hypotheses and K contains
data e reporting the outcome of an experiment covered by these hypotheses so
that, for each $h_i \in U$, the likelihood $L(h_i;e)$ is defined, let L* be the maximum
likelihood assigned an element of U relative to e. Reject an element of U if and
only if $L(h_i;e)/L^* < q$.

$R_2$: If U consists of n hypotheses and $Q(h_i)$ is the probability of $h_i \in U$
relative to K, reject $h_i$ if and only if $Q(h_i) < q/n$.

The first rule is a likelihood rejection rule recommended by Hacking.1 The
second is a rejection rule I advanced in Gambling with Truth.2 I would now
propose a criterion with greater applicability than either of these criteria. On
the understanding that one is to add to K the disjunction of unrejected elements
of U and all deductive consequences, both of these rules are deductively cogent
and belong to a caution dependent family characterised by $k = 1 - q$.

Observe, however, that neither of these rules (more strictly, families of rules) 
is characterised by reference to b-functions or d-functions. Consequently, one 
may start with some family of rules of one of these kinds or of a more sophisti- 
cated variety and define b-functions and d-functions in terms of them rather 
than the other way around. 

Given a caution dependent family of deductively cogent rules, let $q_K(g)$ be
the value of q such that g is rejected relative to K for all values of q greater than
$q_K(g)$ and is unrejected otherwise. If g is unrejected for all values of q, $q_K(g) = 1$.
If it is rejected for all values of q, $q_K(g) = 0$. Then

$d_K(g) = k_K(g) = 1 - q_K(g)$ and $b_K(g) = d_K(\neg g) = 1 - q_K(\neg g)$.

In my previous work on the subject, I proposed reconstructing Shackle 
measures by appealing to independently specified caution dependent families of 
deductively cogent acceptance rules. 
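
As an illustration of how d- and b-values can be read off a family of rules rather than the other way around, the following sketch applies the definitions above to a rule of the second kind (reject $h_i$ if and only if its probability is less than q/n). It rests on assumptions that should be flagged: the probabilities are hypothetical, q is taken to range over [0, 1], and a disjunction is treated as rejected exactly when all of its disjuncts are rejected, so that $q_K(g) = \min(1, n \cdot \max_i Q(h_i))$ over the disjuncts of g. The function names are illustrative only.

```python
from fractions import Fraction

def q_value(g, Q):
    """Largest q in [0, 1] at which the disjunction g is still unrejected,
    assuming each h is rejected iff Q[h] < q / n and a disjunction is
    rejected iff every one of its disjuncts is rejected."""
    n = len(Q)
    return min(Fraction(1), n * max((Q[h] for h in g), default=Fraction(0)))

def d_value(g, Q):
    """Potential surprise of g defined from the family of rules: 1 - q_K(g)."""
    return 1 - q_value(g, Q)

def b_value(g, Q):
    """Degree of belief in g: the potential surprise of its negation."""
    return d_value(set(Q) - set(g), Q)

# Hypothetical probabilities over three exclusive and exhaustive alternatives.
Q = {"h1": Fraction(1, 2), "h2": Fraction(3, 10), "h3": Fraction(1, 5)}
```

With these numbers `d_value({"h2"}, Q)` is 1/10, `d_value({"h3"}, Q)` is 2/5 and `d_value({"h1"}, Q)` is 0, so that `b_value({"h1"}, Q)` is 1/10 and `b_value({"h1", "h2"}, Q)` is 2/5. The resulting values depend on the probabilities only through the points at which rejection sets in, and they combine by minimum rather than by addition.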


1 Hacking [1965], pp. 89-01. 2 Levi [1967].

Thus, in specifying a threshold such that g is to be added into evidence if its
b-value relative to K is higher than that threshold, one is also stipulating how 
cautious (or how bold) one should be before adding new information to the 
background knowledge. 

Cohen nowhere in his book takes note of this connection between his measures 
of inductive probability and caution dependent families of acceptance rules. 
Yet, it is clear, given what he does say about his measures, that he is com- 
mitted to acknowledging the existence of caution dependent families of accep- 
tance rules characterised in terms of different specifications of thresholds for 
acceptance in terms of his inductive probability measures. What is lacking is an
acknowledgment of the relevance of caution dependent families characterised 
without reference to such measures which can then be used to define inductive 
probability measures (i.e., b-functions).

Cohen actually restricts the domain of applicability of his measure of inductive 
probability to a narrowly circumscribed class of cases. He is concerned with 
situations where agent X knows that an event of kind R is about to occur and, 
in addition, has other background knowledge K. X is concerned to ascertain 
which of several different possible results will ensue the occurrence of R and, 
in particular, to assign to the hypothesis that a result of kind S will occur upon 
the occurrence of an R relative to background K a degree of conditional inductive 
probability $b_K(S;R)$. If $K_1$ is the result of adding to $K$ the information that the
event of kind R has taken place and forming the deductive closure,
$b_{K_1}(S) = b_K(S;R)$ has the formal properties characterised by (a’)-(c’).

According to Cohen, $b_K(S;R)$ should equal the degree of inductive support
assigned the hypothesis ‘(x)(Rx ⊃ Sx)’ by the background knowledge K where
inductive support is understood in the sense of Cohen’s The Implications of 
Induction. By proceeding in this fashion, Cohen establishes a link between his 
notion of inductive support and his notion of inductive probability and, at the 
same time, relates inductive probability to the concept of provability. 

According to Cohen, an inference from information that an event of kind R 
has occurred to the conclusion that an event of kind S has occurred is licensed 
by some general rule of inference. If the rule of inference is invariably truth 
value preserving, ‘(x)(Rx ⊃ Sx)’ is true. Perhaps, however, we are not in a
position to claim that the universal generalisation is true but only that it is 
supported to some degree. Then the associated rule of inference furnishes only
a certain grade of provability or, as Cohen suggests, grade of probability. I will 
not attempt to rehearse the various senses of probability Cohen claims to tease 
out of this root idea. I find the discussion of this matter which takes up the first 
48 pages of Cohen’s book extremely contrived and unconvincing. It is designed 
to establish Cohen’s right to call his measure a species of probability. For my 
part, I would gladly grant him that right without such argument provided that 
he carefully distinguishes it from other notions of probability in use. 

In any case, because Cohen seeks to understand inductive probability as based 
on rules of inference supported by universal generalisations which receive 
measures of inductive support, he restricts the domain of applicability of 
measures of inductive probability to inferences from instances of the antecedents 
of universal generalisations to instances of consequents.

According to Cohen, a universal generalisation of the form ‘(x)(Rx ⊃ Sx)’
receives a positive degree of inductive support only if tests of such universal
generalisations have been conducted which belong to an appropriately constituted 
hierarchy of tests ordered with respect to severity and the generalisation has 
passed some initial segment of the sequence. If either no tests have been conducted
or the generalisation has failed the first test in the sequence, the generalisation
receives 0 inductive support. If it is known that the generalisation has
passed at least the first i of the n tests in the sequence, it is assigned at least i/n
degrees of inductive support. If it is known that it has passed the first i tests
but failed the (i+1)-th test, then regardless of how many tests more severe
than the (i+1)-th it passes, its grade of inductive support is exactly i/n.
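The grading rule just described can be illustrated with a minimal sketch. It assumes that the outcomes of the n tests are recorded in order of the hierarchy as passed, failed, or not run; the function name `support_grade` and this encoding are illustrative assumptions and do not appear in Cohen's text.

```python
from fractions import Fraction
from typing import Optional, Sequence, Tuple


def support_grade(results: Sequence[Optional[bool]]) -> Tuple[Fraction, bool]:
    """Grade of inductive support for a hierarchy of n = len(results) tests.

    results[j] records test j+1 in the hierarchy (hypothetical encoding):
    True = passed, False = failed, None = not run.

    Returns (grade, exact). The grade is i/n, where i is the number of tests
    passed in the unbroken initial segment. `exact` is True when a failure
    fixes the grade at exactly i/n (or when every test was passed), and
    False when i/n is only known to be a lower bound.
    """
    n = len(results)
    if n == 0:
        raise ValueError("the hierarchy must contain at least one test")
    passed = 0
    for outcome in results:
        if outcome is not True:
            # First test that was failed or not run: later passes are ignored.
            return Fraction(passed, n), outcome is False
        passed += 1
    return Fraction(1), True


if __name__ == "__main__":
    # Passed tests 1 and 2, failed test 3, passed the more severe test 4.
    print(support_grade([True, True, False, True]))
```

On this encoding, passing a more severe test after an earlier failure leaves the grade unchanged, which is the feature of the scheme that the Civil Service comparison later in the review turns on.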

The hierarchy of tests is determined by background knowledge K which, in 
the first instance, determines what kinds of factors are relevant ‘variables’ or 
factors controlling when events of kind S follow events of kind R. Moreover, 
these variables are arranged in a sequence representing the degree of relevance. 
The first such variable $v_1$ specifies whether R occurs or not. That is to say, its
‘variants’ are the occurrence and non-occurrence of R. The second variable $v_2$
has a ‘normal’ variant $N_2$ and one or more abnormal variants $V_{21}, V_{22}, \ldots, V_{2m_2}$.
Similarly, the third variable $v_3$ has a normal variant $N_3$ and one or more abnormal
variants $V_{31}, V_{32}, \ldots, V_{3m_3}$. The procedure reiterates through the n variables
judged to be relevant and ordered in decreasing order of relevance. 

These n variables are considered relevant to the generalisation ‘(x)(Rx ⊃ Sx)’
in the sense that, given the background K, if an event of kind R occurs, whether 
a result of kind S occurs or not depends entirely on which variants of the n 
variables obtain and on no other circumstances. Given a combination of circum- 
stances specified by identifying the variants of all the relevant variables, if R 
occurs under this combination of circumstances, either S invariably follows or S 
invariably fails to follow. 

Obviously more needs to be said about relevant variables and the extent to 
which Cohen’s account is faithful to scientific experimentation as he claims it to
be. Nonetheless, Cohen is quite right that when an investigator designs an 
experiment, he should take care that it institutes controls for variants of all the 
relevant variables in some sense or other of relevance and that judgments as to 
which variables are relevant belong to the ‘background knowledge’ and may, 
in the course of inquiry, be subjected to revision. 

What is rather more doubtful is whether ‘Baconian’ method requires a 
ranking of the relevant variables with respect to what Cohen calls ‘falsificatory’ 
potential. Cohen explains that this ranking is also based on background informa- 
tion culled from considering the testing of generalisations belonging to the same 
category as the generalisation under scrutiny. A combination of circumstances 
characterised by specifying variants (abnormal or normal) for all the variables 
which are relevant falsifies ‘(x)(Rx ⊃ Sx)’ when R occurs in the presence of such
a combination and S fails to occur. Such a combination of circumstances will 
have normal variants for a subset of the relevant variables and abnormal variants 
for the rest. 

The variable $v_2$ has the highest falsificatory potential and, whatever this


1 What I call an ‘abnormal variant’, Cohen calls a ‘variant’. When no variant (i.e.,
abnormal variant) is present, Cohen says that the situation is normal with respect to the
variable (p. 136). I use ‘variant’ to cover both normal and abnormal cases.
means, it implies that the generalisation could be falsified by a combination of
circumstances where an abnormal variant of $v_2$ is present and the variants of all
variables $v_i$ for $i > 2$ are normal. This does not mean that the generalisation will
be falsified for such a combination of circumstances. One obscurity in Cohen’s 
scheme is the force of the ‘could’ in this judgment. 

$v_3$ has less falsificatory potential than $v_2$. It is not possible, given the background
knowledge, for a combination of circumstances to falsify the generalisation
where $v_3$ has an abnormal variant but the variants of all other variables
including $v_2$ are normal, unless the generalisation is also falsified in the case
where $v_2$ and $v_3$ have abnormal variants but the variants of other variables are
normal. Similar remarks apply mutatis mutandis to other variables in the
series.¹

This ordering of relevant variables from $v_2$ onwards with respect to falsificatory
potential also induces an ordering of relevant combinations of circumstances
with respect to falsificatory potential.² The combination with greatest
potential is that in which all variants are normal. Consider any combination
where $v_2$ has an abnormal variant and all other variants are normal. Such a
combination of circumstances belongs at the next lowest level of falsificatory
potential. Any combination where the variants of $v_2$ and $v_3$ are abnormal and
the remaining variants are normal is at the next lowest level and so on.

If there is a combination where the variant of $v_3$ is abnormal and the other
variants, including that for $v_2$, are normal, the falsificatory potential of that
combination can be no different from that for a combination where $v_2$ is abnormal
as well as $v_3$.

These remarks explain to some degree some of the formal constraints on the 
ordering of relevant variables with respect to falsificatory potential and the cor- 
relative ordering of combinations of relevant circumstances. But it still does not 
explain the import of such a ranking. 

However, I think Cohen means to say that if one is about to test the merits 
of ‘(x)(Rx ⊃ Sx)’, it is to be expected given the background knowledge that, if
the generalisation is falsifiable at all, it will be falsified by a combination of 
circumstances bearing high falsificatory potential rather than by one bearing 


1 Cohen fails to state this assumption explicitly; but it is crucial to the situations he
considers typical of experimental science (i.e., excluding the allegedly remote cases he
discusses on pp. 144-5). Let the generalisation ‘(x)(Rx ⊃ Sx)’ pass the first three tests
in a Cohen hierarchy. R has been realised in the presence of normal variants for all
relevant $v_i$s where $i > 1$. R has been realised in the presence of all possible cases of
abnormal variants of $v_2$ where the variants for all other $v_i$s are normal. R has been
realised in the presence of all possible combinations of abnormal variants of $v_2$ and $v_3$
where variants for all $v_i$s where $i > 3$ are normal. For all these cases, the generalisation
passes. On p. 185, Cohen clearly implies that in a case such as this, the data fully support
‘Every R unaccompanied by an abnormal variant of a variable $v_i$ for $i > 3$ is S’. I take it
that Cohen means to preclude falsification by a situation where the only abnormal
variant (other than the presence of an R) is the occurrence of an abnormal variant of $v_3$.
But this is exactly the assumption I attribute to Cohen.

2 Cohen does not call this induced ranking of combinations of circumstances a ranking
with respect to ‘falsificatory potential’. Only ranking of relevant variables is alleged 
to be with respect to falsificatory potential. I think my extension of his terminology 
suggestive and fair; but nothing much hinges on it. What is crucial is that his ranking 
of relevant variables and tests generates a ranking of combinations of circumstances. 
The significance of the ranking will emerge subsequently. 
low falsificatory potential. What is not clear is whether Cohen is claiming that
the probability of falsification (in a sense conforming to the classical calculus of
probabilities) conditional on realising a combination with high falsificatory 
potential is greater than the probability of falsification conditional on realising 
a combination with low falsificatory potential or whether the comparison involves 
some other mode of appraising what is to be expected. I suspect that Cohen 
does mean probability in some sense conforming to the requirements of the 
classical calculus; for the evidence for assessments of falsificatory potential 
comes from records of the frequencies with which other generalisations belonging 
to the category to which the generalisation under investigation belongs have 
been falsified by various combinations of circumstances. 

In any case, according to Cohen, in testing ‘(x)(Rx ⊃ Sx)’, highest priority
should be given to controlling for variables with high falsificatory potential.
Given that the generalisation passes tests controlling for variables of this sort,
it may then be desirable to test for variables of lesser relevance (i.e., falsificatory
potential) as well.

I can well understand why, under circumstances where variables can be 
ranked with respect to falsificatory potential in a manner conforming to the 
conditions described above, one might find a use for such a ranking in a situation 
where experimentation incurs costs and one is not able to test for all relevant 
variables. A priority among relevant variables would then be desirable so that 
one might decide which of the relevant variables to test for. 

But although such a ranking would be nice to have when cost of experimenta- 
tion becomes critical, it is not absolutely necessary. One can still control for 
relevant variables subject to cost constraints even if one cannot rely on a ranking 
with respect to falsificatory potential to help choose between the variables to 
be subjected to control. In this sense, ranking with respect to relevance or 
falsificatory potential is not essential to methods of controlled experimentation. 
This point is important; for Cohen tends to insist (for reasons obscure to me) 
that controlled experimentation requires such a ranking and that those who do 
not understand this have no sympathy at all for the Baconian tradition. 

The point is strengthened by consideration of cases where the agent is able to 
control for all relevant variables. In such contexts, it is utterly unclear to me 
what importance the ranking with respect to falsificatory potential has. Its 
role in providing a way of selecting variables to control disappears. 

Needless to say, Cohen would disagree. Suppose that ‘(x)(Rx ⊃ Sx)’ passes
a test which controls for no relevant variables so that all variants are normal 
ones. (This is what Cohen means by failing to control for relevant variables. 
There is another, more natural reading, where failing to control for relevant 
variables means that no steps are taken to determine whether the variants of any 
of the relevant variables are normal or abnormal. Cohen cannot use this more 
natural reading without falling into a difficulty he discusses, in one form, on 
pp. 135 ff. He solves the difficulty by using a notion of control and lack of control 
which does not in any obvious sense fit scientific practice. But I let this go.) 
The generalisation has been tested for circumstances bearing highest falsificatory 
potential and has passed. But it would be tested more severely if it passed a test 
for all possible circumstances at the next level down of falsificatory potential. 
Although passing the first test reveals the generalisation’s ‘ability to resist 
falsification’ (p. 140) to some degree, the ability is shown to be greater yet by its
passing the test of greater severity. Needless to say, tests of greater severity 
yet involve controlling for still more variables. 

Suppose that ‘(x)(Rx ⊃ Sx)’ has been subjected to all tests of increasing
severity in the hierarchy thus generated by the ranking of relevant variables 
(and combinations of circumstances) with respect to falsificatory potential. 
Suppose further that it has passed all tests through the i-th level and has failed
the (i+1)-th test. According to Cohen, the generalisation bears positive support
even though it is known to be false on the basis of the data and background
knowledge. What this means is that the generalisation has a degree of ability
to resist falsification equal to the ordinal number i/n.

Thus, grading hypotheses with respect to inductive support in Cohen’s 
sense is like grading members of the Civil Service. The grade in the Civil 
Service depends on tests one has passed in a given series. Even if one has failed 
the (i+1)-th test, if all i tests in the initial segment have been passed, one has the
‘merit’ to remain in the i-th category in the Civil Service. Similarly even if the
evidence contradicts a generalisation and the agent knows the generalisation to 
be false, he may and should regard the generalisation as having the ‘merit’ 
indexed by a positive degree of inductive support. 

Clearly Cohen’s notion of inductive support belongs in none of the three 
categories I discussed earlier. It does not index risk of error in accepting the 
generalisation. The risk of error could be maximal because the generalisation 
is falsified by the evidence and yet the generalisation could be highly supported. 
In choosing among rival potential answers for the purpose of augmenting 
evidence, it would be foolhardy to maximise support in Cohen’s sense if this 
leads to choosing an alternative known to be false. For a similar reason, we 
should avoid accepting a hypothesis merely because its inductive support is high enough.

To conclude from this that Cohen’s notion of support is of no value would be 
wrong. It is fair to ask, however, for an explanation of the use of Cohen’s 
measures in deliberation and inquiry. 

A possible use is as a summary of information concerning the results of 
batteries of tests meeting the rather special conditions Cohen imposes on such 
tests. 

Observe, however, that the sort of summary provided by reports of inductive 
support is highly selective of the information summarised. Why and for what 
purpose should we retain the information conveyed by a report of inductive 
support rather than other information about the results of testing? 

In The Probable and the Provable, Cohen does furnish one possible use of his
measure of inductive support: to wit, as a determinant of evaluations of
inductive probability, i.e., of b-values or degrees of confidence of acceptance
with the intended applications Cohen and I agree they should have.

Recall that if a generalisation is highly supported, it has resisted falsification 
under all relevant circumstances except for those whose falsificatory potential 
is minimal. Cohen explicitly states that a generalisation with an ‘ability to resist 
falsification in certain combinations of circumstances’ is ‘reliable’. In this 
sense, a generalisation bearing high inductive support is thereby reliable even 
if it is false and, indeed, is certainly false. 

Suppose that agent X knows that an event of kind R is about to occur and is 
interested in whether an event of kind S will ensue. He knows which variables
are relevant for S in the light of R and can order them with respect to falsificatory 
potential. He also knows that a complete battery of tests has been run and knows 
the results. (Cohen envisages other cases as well; but they are more complicated 
and cannot be discussed in the short space of this review.) Consequently, X 
knows for each possible combination of relevant circumstances whether R’s are 
invariably followed by S’s or not. What X does not know is which possible 
combination of relevant circumstances is present. Thus, he does not know 
whether all variants of all relevant variables are normal or, if some are abnormal, 
which are. 

Under special circumstances like these, Cohen claims that the agent X should 
assign a degree of inductive probability (i.e., a degree of confidence of acceptance
or b-value) equal to the degree of inductive support assigned to the universal 
generalisation ‘(x)(Rx ⊃ Sx)’. This means that if the inductive support for the
generalisation is very high but is not a maximum (so that the generalisation is 
false and is known to be false), the agent X would still be justified in assigning 
to the hypothesis that an S will occur a degree of confidence of acceptance 
sufficiently high to warrant his adding that hypothesis to the evidence and 
ceasing inquiry into the matter. 

In this manner, Cohen’s measure of inductive support is given a use as a 
determinant of inductive probability which, in turn, has a use as a mode of
appraising weight of argument. 

According to Cohen, the reason we are entitled to link inductive probabilities 
of inferences from R’s to S’s with the inductive support for ‘(x)(Rx ⊃ Sx)’ in
the manner indicated is that inductive support is an index of the reliability of
the generalisation, i.e., its ability to resist falsification.

In some sense of ‘reliable’, it is undoubtedly true that the reliability of a 
generalisation controls the degree of confidence with which one may legitimately 
infer the occurrence of an S from information that an R has or is about to occur. 
However, in the sense in which Cohen takes a generalisation to be reliable, this 
link between reliability and degrees of confidence does not hold. 

Let the generalisation be supported to the (n−1)/n degree. This means that
the generalisation survives falsification under all relevant combinations of 
circumstances except some combination or combinations bearing minimum 
falsificatory potential. Moreover, even though those combinations of circum- 
stances bear minimal falsificatory potential, they do falsify the generalisation 
and X knows them to be not-S inducing in the presence of R.

If X knew such a not-S inducing combination of circumstances to be present, 
he should be certain that an S will not occur and if he knew that the combination 
did not obtain, he should be certain that an S will occur. But we suppose that 
X is in suspense. His degree of confidence of acceptance in the hypothesis that 
an S will occur should be equal to the degree of confidence of rejection (degree 
of potential surprise) he assigns the hypothesis that an S will not occur. Given 
the background, the d-value assigned the hypothesis that an S will not occur is 
equal to the d-value (degree of confidence of rejection or potential surprise) 
assigned the disjunction of hypotheses asserting that a combination of circumstances
which is not-S inducing obtains; and this d-value is equal to the d-value
assigned the disjunct bearing a minimum d-value. 
All of this follows strictly from the theory of inductive probability (degrees
of confidence of acceptance or b-values) endorsed by Cohen when its scope of 
applicability is extended from hypotheses about the results of the occurrence 
of an R to hypotheses about the unknown combination of relevant circumstances. 
Surely given his Baconian commitment Cohen will agree that X’s judgments 
concerning the occurrence of an S depend critically on his judgments concerning 
the combination of relevant circumstances which obtains. 
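
The relation between b-values and d-values used in this argument can be put in a short sketch. It assumes d-values on a scale from 0 to 1, that a disjunction takes the minimum d-value of its disjuncts, and that the b-value of a hypothesis equals the d-value of its negation. The function names, the labels for combinations of circumstances, and the numerical d-values are hypothetical illustrations only.

```python
from typing import Dict, Iterable


def d_of_disjunction(d: Dict[str, float], disjuncts: Iterable[str]) -> float:
    """d-value (potential surprise) of a disjunction of rival hypotheses:
    the minimum of the d-values of its disjuncts."""
    values = [d[name] for name in disjuncts]
    if not values:
        # Assumption: an empty disjunction is treated as ruled out by the
        # background knowledge and so receives the maximal d-value of 1.
        return 1.0
    return min(values)


def b_value_of_S(d: Dict[str, float], induces_S: Dict[str, bool]) -> float:
    """b-value of 'an S will occur' = d-value of 'an S will not occur',
    i.e. of the disjunction of the not-S inducing combinations."""
    not_s_inducing = [name for name, s in induces_S.items() if not s]
    return d_of_disjunction(d, not_s_inducing)


if __name__ == "__main__":
    # Hypothetical d-values X assigns to the rival combinations.
    d = {"all_normal": 0.0, "v2_abnormal": 0.3, "v2_v3_abnormal": 0.8}
    # Known test results: which combinations invariably yield S given R.
    induces_S = {"all_normal": True, "v2_abnormal": True, "v2_v3_abnormal": False}
    print(b_value_of_S(d, induces_S))
```

In this sketch the b-value depends only on the d-values X assigns to the not-S inducing combinations. The ranking of combinations by falsificatory potential enters nowhere unless it is separately assumed to fix those d-values.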

If, in our example, X is to assign a high inductive probability (b-value) to 
the hypothesis that an S will occur because the inductive support for the general- 
isation is so high, he must be obliged to assign a high d-value (potential surprise) 
to the hypothesis that not-S inducing circumstances are present because these 
circumstances have low falsificatory potential. More generally the ranking of 
combinations of circumstances with respect to falsificatory potential should 
tend in the opposite direction from the ranking of these combinations with 
respect to d-value or potential surprise. 

However, Cohen’s characterisation of falsificatory potential furnishes no 
basis for assuming that if combination of circumstances a bears greater falsificatory
potential than combination b, d(b) > d(a). And, even if we grant Cohen
his view of the importance of rankings with respect to falsificatory potential,
no argument for this assumption emerges. I conjecture that Cohen has confused
two senses of normality or abnormality. In one sense, circumstances which are 
normally present are to be expected to obtain more than abnormal circumstances. 
But in Cohen’s theory, normal circumstances are those bearing maximum 
falsificatory potential regardless of how likely it is that they obtain (and this is 
so whether ‘likelihood’ means ‘probability’ in a sense conforming to the calculus 
of probabilities or ‘confidence of acceptance’ or b-value). I suspect that Cohen 
has moved illicitly from the second sense to the first in establishing his link 
between inductive probability and inductive support. 

Cohen could restrict the scope of applicability of his account of inductive 
probability to cases where the assumption does hold; but then the domain of 
applicability of his account of inductive probability will be narrow indeed. 
Moreover, the restriction is gratuitous. As long as some appropriate caution 
dependent family of deductively cogent acceptance rules is available for use in 
assigning d-values and b-values to hypotheses about the combination of relevant 
circumstances, we do not need to rely on reference to inductive support to 
assign inductive probabilities to inferences from R’s to S’s. Even in those cases 
where inductive support can be correlated with inductive probability, the 
relevant factors for determining b-values (i.e., inductive probability) depend
on reference to the caution dependent family of deductively cogent acceptance 
rules being employed. We do not, even there, need to allude to inductive support. 

Thus, appeal to inductive support is neither necessary nor at all useful in 
making assessments of inductive probability in the sense which Cohen and I 
agree is of some importance in inquiry and deliberation. In saying this, I mean 
no disrespect to the ‘Baconian’ method of controlling for relevant variables— 
although I remain unconverted to the importance of rankings of variables with 
respect to falsificatory potential except for the purpose of deciding which relevant 
variables to control when one cannot control all of them. And even acknow- 
ledging this marginal value for rankings of variables with respect to falsificatory 
potential, I remain utterly in the dark as to the usefulness of measures of
inductive support. 

In any case, the scope of applicability of caution dependent families of 
deductively cogent acceptance rules extends well beyond contexts of inference 
from knowledge of events of kind R to events of kind S. An account of b-values 
and d-values (of Cohen’s inductive probabilities) can be applied in contexts 
where the task is to estimate the value of an unknown statistical parameter or 
to choose one of rival theoretical hypotheses. 

To be sure, one has to countenance some sort of inductive acceptance or 
rejection rules; but Cohen does not object to doing so. It is also clear that we 
need some account of criteria for inductive acceptance or rejection which 
distinguishes adequate caution dependent families from inadequate ones. 
Granted that such an account remains to be completed (in my view substantial 
progress has been made), it is demonstrable that as long as adequate families 
must satisfy requirements of deductive cogency, d-functions and b-functions
will behave like Shackle measures and that they should be interpretable as indices 
of weight of argument in the sense explained earlier on. We do not, therefore, 
have to claim to have settled all problems to claim to have settled some of them. 

It seems to me that Cohen has hobbled his account of inductive probability 
understood as having application to assessments of weight of argument and to 
determining thresholds of acceptance into evidence by restricting its domain 
of applicability in order to attempt (unsuccessfully I think) to give a use to his
measures of inductive support. 

The theory of inductive probability or degrees of confidence of acceptance 
has a life of its own independent of Cohen’s theory of inductive support. It is a 
good thing this is so; for the Shackle-like measures do characterise interesting 
features of evidence and hypotheses in deliberation and inquiry. Cohen has 
succeeded in identifying many of these features; and he has broken new ground 
by exploring applications in legal contexts which, I regret to say, I am incom- 
petent to assess but which, nonetheless, seem to me to be both interesting and 
important subjects for critical scrutiny. At the same time, Cohen has failed to 
identify any scientific context where it is desirable to make appraisals of hypo- 
theses with respect to inductive support in his sense. To the extent that Cohen 
has attempted in his recent book to lend respectability to inductive support 
by using it in his explication of inductive probability, he has, so I believe, 
damaged his case for inductive probability. 

In sum, Cohen’s insight serves him better than his ideology. Discriminating 
readers will find much to treasure in his book if they dig hard enough; and the 
effort will be well worth their while. 


ISAAC LEVI 
Columbia University 
Reviews


SIMON, HERBERT A. [1977]: Models of Discovery, and Other Topics in the Methods
of Science. Dordrecht, Holland: D. Reidel. $39.50. Pp. xx+456. 


This book republishes twenty-three papers by the author, and three that were 
co-authored, on a wide range of topics in the philosophy and methodology of 
science. The papers were originally published at a variety of dates during the 
past thirty-five years. But they are here collected into six different groups and 
an attempt has been made to knit them together by the addition of a general 
introduction at the outset and of separate introductions to each group of papers. 
The first group of papers is concerned with statistical tests and confirmation- 
theory, the second with the concept of causality in experimental science, the 
third with the logic of decision-making, the fourth with the structure of complex 
hierarchies, the fifth with the nature of scientific discovery, and the sixth with 
the formalisation of scientific theories. The author exhibits an impressively wide
acquaintance with the literature and problems of cognitive psychology, artificial 
intelligence, decision-theory, automata-theory, formal logic and classical 
mechanics. He has something interesting to say in almost all his papers; and the 
book serves a very useful purpose in bringing together ideas or formulations in 
different fields that are mutually connected in various illuminating ways. 

It is quite impossible to do justice to so many papers in a short review. I shall 
therefore make one general comment and then concentrate on what I take to 
be the most important issue raised by Simon’s work on the methodology of 
science. 

The general point is this. Simon himself cautions against making arguments 
and expositions any more formal or mathematical than they need to be. But 
even when excessive formalism is avoided there is still the risk that attention to 
technical details may somehow inhibit the flow of philosophical reflection and 
obscure a failure to come to grips with the real problems. In such cases it might 
be best not to reprint a paper in its original form. For example, Simon rejects the 
view that the asymmetry of cause and effect should be defined in terms of tem- 
poral sequence. He therefore seeks to define causation (pp. 81-91) in terms of a 
formal-logical relation that induces a partial ordering among the members of a 
set of atomic empirical sentences in a first-order, truth-functional language. 
The question that might immediately spring to a philosopher’s mind is: how 
can we be sure that there may not be other concepts, such as ‘is caused by’, or 
‘is followed by’, that have the same formal-logical structure? Simon seems never 
to consider this issue. Again, it is often supposed that the concept of causation 
in experimental science carries with it the implication that, other things being 
equal, mutually similar causes have mutually similar effects: after all, that is 
why experiments are supposed to be replicable. So, if an analysis of the concept 
of causation in experimental science does not carry that implication its author 
ought at least to explain why he has omitted it. But Simon makes the omission 
without giving us any explanation. Perhaps part of the reason for this is his
rather Pickwickian choice of terminology. Thus an empirical law is defined as
‘a molecular sentence formed from one or more of the n atomic sentences’ 
(p. 82). So, to take an example like one of Simon’s, ‘A lighted match was ob- 
served in my room at noon on July 12, 1978’ turns out to be an empirical law! 
It is not clear that this kind of paper deserves reprinting 1n its original form. 

Simon has made contributions to the philosophy of science that are more 
worthwhile reprinting in fields where his own special expertise is relevant. This 
is particularly in evidence where he argues for a theory of scientific discovery 
on the basis of recent results in artificial intelligence. The study of artificial 
intelligence has supplied us with some quite determinate information about the 
possibility of discovering solutions for certain sorts of problems in certain sorts 
of ways. It is obviously reasonable to enquire whether this information can 
serve to dispel any of the ‘dense mists of romanticism and downright know- 
nothingism’ (p. 266) that tend to surround the subject of creativity in general 
and scientific discovery in particular. And Simon in fact makes substantial 
claims here both in regard to the description and explanation of the thought- 
processes that have actually been involved up to now in scientific discovery and 
also in regard to the norms to which, optimally, such thought-processes should 
conform. 

Roughly, Simon’s view is that scientific discovery, like theorem-proving, 
chess-playing, etc. in artificial intelligence, operates via a highly selective trial- 
and-error search of solution possibilities. A good chess-playing programme uses 
appropriate heuristic strategies to sidestep the combinatorial explosion that is 
entailed by the exhaustive exploration and evaluation of possible moves, 
counter-moves, counter-counter-moves, etc. A good scientist, says Simon, does 
the same. No doubt it is easier to understand how this can happen when a space 
of possible solutions is determined by some accepted framework or dominant 
paradigm. But Simon claims that ‘there are no qualitative differences between 
the processes of revolutionary science and of normal science, between work of high 
creativity and journeyman work’. On his view this is because both the creation 
of new problems and the creation of new representations are incidental to the 
operation of any sufficiently sophisticated, and sufficiently exercised, problem- 
solving system. Such a system generates new problems for itself by the discovery 
of anomalies—facts that cannot be reconciled with existing solutions. Similarly 
new representations are never completely novel but arise by the modification 
and development of previous representations. Just as the experience and 
expertise of chess-masters supplies the computer scientist with useful heuristic 
strategies to incorporate into his chess-playing programme, so the experience 
and expertise of leading scientists help others to search in the right directions 
for appropriate developments of their problems and representations. Simon is 
of course aware that great discoveries in science are rather rare, and turns that 
fact to advantage in support of his own theory. He argues from it that a system 
which is to explain human problem-solving and scientific discovery does not 
need to incorporate a highly powerful mechanism for inventing completely 
novel representations. ‘If it did contain such a mechanism,’ he says, ‘it would be 
a poor theory, for it would predict far more novelty than occurs’ (p. 302). Indeed, 
real discoveries are so rare that a theory to explain them ‘must predict 
innumerable failures for every success’ (p. 288). Nor, according to Simon, need there 
be anything especially puzzling about the fact that unconscious incubation of a 
problem is so often reported to have prepared the way for awareness of its 
solution. Simon suggests that the difference between short-term and long-term 
memory is critical here. Suppose that a tree of possible solutions is being 
searched, and that while the relatively earlier and more important nodes are 
held in long-term memory, the results of exploring a particular sub-node are 
held only in short-term memory. At the same time relevant new information 
may be coming in that cannot be exploited while a particular sequence of sub- 
nodes is being successively investigated. But, if short-term memories have been 
deleted while the problem ‘incubates’, it is possible to restart the process of 
investigation via a different sequence of sub-nodes and also, perhaps, to make 
use of newly acquired, but hitherto neglected, information. 
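
The contrast between exhaustive exploration and highly selective trial-and-error search can be illustrated by a minimal sketch. It is not drawn from Simon's own programmes: the toy problem (reach TARGET from START by the moves '+1' and '*2'), the names used, and the heuristic that numerically closer states are more promising are all assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical toy problem: turn START into TARGET using these moves.
MOVES = {"+1": lambda x: x + 1, "*2": lambda x: x * 2}
START, TARGET, MAX_DEPTH = 1, 10, 6


def apply_moves(path):
    """Apply a sequence of move names to START and return the result."""
    value = START
    for name in path:
        value = MOVES[name](value)
    return value


def exhaustive_search():
    """Try every move sequence up to MAX_DEPTH; return (path, sequences tried)."""
    tried = 0
    for depth in range(1, MAX_DEPTH + 1):
        for path in product(MOVES, repeat=depth):
            tried += 1
            if apply_moves(path) == TARGET:
                return list(path), tried
    return None, tried


def selective_search(beam_width=2):
    """Keep only the beam_width most promising states at each depth."""
    beam = [(START, [])]
    seen = {START}
    generated = 0
    for _ in range(MAX_DEPTH):
        candidates = []
        for value, path in beam:
            for name, move in MOVES.items():
                child = move(value)
                generated += 1
                if child == TARGET:
                    return path + [name], generated
                if child not in seen:
                    seen.add(child)
                    candidates.append((child, path + [name]))
        # Heuristic (an assumption): states numerically closer to TARGET
        # are treated as more promising; everything else is discarded.
        candidates.sort(key=lambda state: abs(TARGET - state[0]))
        beam = candidates[:beam_width]
    return None, generated
```

On this toy instance the selective search generates fewer candidates than the exhaustive enumeration tries, but it returns a five-move path where a four-move one exists. A heuristic that prunes the search can also prune away the solution altogether, which fits Simon's remark that such a theory must predict failures as well as successes.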

Simon also believes that there can be a logic, or normative theory, of scientific 
discovery. He does not mean, he says (p. 327), that there exists a process for 
deriving scientific laws deductively from particular empirical facts. But, he 
claims, one can instead devise pattern-discovery procedures for recording, in 
parsimonious fashion, sets of empirical data, and then one can have a normative 
theory of scientific discovery which is constituted by a set of criteria for evalu- 
ating such pattern-discovery procedures. Of course, the successful extraction 
of a pattern from given data does not guarantee its predictive value. But Simon 
is content to separate the question of pattern detection from the question of 
prediction and to regard the logic of discovery as being concerned only with the 
former question and not with the latter one as well. 

What is to be made of all this? The nature of scientific discovery is a legitimate 
topic of intellectual interest. It exercised early modern epistemologists, like 
Bacon and Leibniz, a great deal. But more recent philosophers of science have 
tended either to evade the issue, by consigning it to some other branch of 
learning, such as psychology or sociology, where in fact very few people have the 
right expertise to deal with it, or to deal with the topic anecdotally, 
by eliciting superficially appropriate morals from a series of independent his- 
torical narratives. Simon has at least produced some kind of theory about 
scientific discovery, and there is something to be gained from the opportunity 
to criticise it. 

So far as his descriptive or explanatory theory is concerned, what is chiefly 
missing is any systematic attempt to check it against the facts. A few episodes 
are mentioned (Kepler on planetary revolutions, Einstein on special relativity, 
Poincaré on Fuchsian functions), and Simon claims that these conform to his 
theory. But no attempt is made to derive further conclusions from the theory— 
either about past events or about future ones—so that by checking those con- 
sequences we can confirm or disconfirm the theory. Again, a comparison is 
drawn between the search for new scientific knowledge and the search for a 
good chess move. Yet no attempt is made to substantiate this comparison by 
studying scientists’ protocols experimentally in the way that de Groot so 
revealingly studied chess-players’ protocols. Perhaps if Simon had made such 
an attempt he would have come up against some of the very real difficulties 
that underlie the ‘knownothingism’ which he rightly deplores. Such difficulties 
arise even if Simon is right in thinking that the formal structure of scientific 
discovery is the same as that currently favoured for problem-solving in artificial 
intelligence. For example, the rules of chess are constant from player to player 
and from experiment to experiment. It is therefore possible to take moves as 
legitimate units in terms of which to measure depth or width of search, and 
to compute appropriate comparisons between different players or different 
types of chess-situation. How can similar measurements and computations be 
possible for scientific discovery, except perhaps within those very narrowly 
determined frameworks within which computer-controlled experiments are 
actually carried out? The terminology of one scientific enquiry is seldom suffi- 
ciently analogous to that of another for any well-defined units of analysis to be 
evident in both. Yet without being able to draw comparisons on the basis of 
such units—comparisons between what happens in one subject, place, period, 
or circumstance of research and what happens in another—the explanatory 
potential of Simon’s theory is severely restricted. 

Nor does Simon’s normative theory contain much promise of enlightenment. 
The chess-player’s problem is a constant one. The rules of the game are invariant 
and the player’s object is to find a way, in conformity with the rules, to check- 
mate his opponent. Hence chess-playing programmes can be gradually im- 
proved by incorporating into their search-strategy heuristic maxims that derive 
from the experience and expertise of great chess-masters. But the problems of 
science, and the frameworks within which solutions are to be found, are con- 
tinuously changing as the horizons of enquiry retreat. So the experience and 
expertise of great scientists cannot constitute a source of heuristic maxims that 
would be as useful for discovering patterns in nature as are the dicta and recorded 
games of great chess-masters for discovering good moves in chess. Rather, a 
worthwhile pattern-discovery programme for a particular set of scientific 
problems could at best be put together only when the major contributions to 
solving those problems had already been independently made by the pioneering 
efforts of individual scientists. Indeed something very like this is already 
achieved in contemporary science whenever a series of experiments is controlled 
and monitored by a computer and the resultant readings are digested into some 
visually displayed pattern. In each such case the real task for discovery is now 
to find an explanation for the patterns that have thus been revealed; and this is 
not a task that the computer can yet be programmed to perform. The analogy 
for chess would be a practice of changing the rules of the game, or the definition 
of a win, every time a programme were constructed that could play as well as a 
grandmaster. 

Moreover, even if we disregard this very substantial limitation on the 
practicable scope of pattern-detection programmes in scientific discovery, it is 
open to question whether criteria for evaluating pattern-detection programmes 
have much of a claim to be called a normative theory of scientific discovery, 
unless those criteria have at least some concern with predictive value. And this 
means that each particular set of data within which a pattern is found must be 
registered under some appropriate category. It is not the bare existence of 
certain patterns that has to be registered, but the fact that the patterns are found 
under certain circumstances, in certain processes, etc. Only thus can the 
replication of the pattern become an object of intelligent enquiry. But how specific, 
or how general, is the category under which this or that set of data is to be 
registered? For example, are the reactions studied in some animal-psychology 
experiment to be registered as the reactions of rats, of muridae, of rodentia, of 
mammals, or of what? A programme that would determine this question, and 
evaluate the hypotheses so generated, would scarcely fall short of constituting 
a process for deriving scientific laws deductively from particular empirical facts. 
Yet Simon rightly sees the latter to be an illusory ideal. What he does not seem 
to see is that the extent to which a pattern-detection programme does fall short 
of that ideal is the extent to which criteria for evaluating such programmes 
inevitably fail to provide a normative theory of scientific discovery. 


L. JONATHAN COHEN 
The Queen’s College, Oxford 


PITCHER, G. [1977]: Berkeley. London: Routledge and Kegan Paul. £7.50. Pp. 
xi+277. 


This book seeks to be a careful exposition, rather than a new interpretation, of 
Berkeley’s philosophy. It begins with a very detailed analysis of several central 
themes in the New Theory of Vision. There are chapters on, e.g., distance 
perception and the heterogeneity of the objects of sight and touch. These are 
followed by chapter length discussions of a number of topics from Berkeley’s 
Principles and Three Dialogues. Pitcher tries to sort out and place before us each 
component (and particularly the putative hidden subcomponents) in Berkeley’s 
arguments. This can get complicated. For example: 


These two distinctions are quite distinct. It is no doubt true that there 
cannot be a case of immediate perception w/o int. that is not also a case of 
immediate perception w/o inter., and it is also true that many cases of 
mediate perception w. int. are also cases of mediate perception w. inter. (as 
in the glowing poker example). But mediate perception w. inter. is neverthe- 
less different from mediate perception w. int., because (to consider the 
case of vision) while mediate seeing w. inter. of x can normally occur 
neither with immediate seeing w/o inter. of x nor with immediate seeing 
w/o int. of x, mediate seeing w. int. of x, on the contrary, often occurs with 
both kinds of immediate seeing of x (p. 10). 


It is my impression that Pitcher understands himself to be presenting a 
critical exposition of what he takes to be Berkeley’s central metaphysical claims. 
Pitcher’s own principles of interpretation remain largely implicit with the result 
that the book is given the air of an unbiased, purely analytical account of 
Berkeley's arguments. Pitcher assumes without question that he is able to 
explicate Berkeley's words without paying much heed to the philosophical 
context in which they were written. Moreover, he often tacitly relies on ordinary 
language, or on what Thomas Reid called ‘first principles of common sense’, 
to provide the cutting edge to his own arguments. The consequences are 
disastrous. For example, in the course of an extended and careful discussion of 
abstract ideas, we read: ‘It cannot be maintained that nothing at all is needed to 
constitute the word-world links . . .’ (p. 87). Again: ‘Berkeley, I say, apparently 
sees no need to explain the word-world links; at any rate, he offers no account 
of how words are connected to their referents’ (p. 89). He concludes: 


I lament the fact that Berkeley takes this line, because, as I tried to show, 
some account, some analysis, of the relation that binds a general term to 
just the particulars it denotes (i.e., its referents) is called for. On this point, 
at least, it seems to me that the view of abstract idea theorists such as 
Locke is superior to that of Berkeley; for whatever the inadequacies of their 
doctrine may be, they at least see the problem and offer a solution to it, 
whereas Berkeley, apparently, does not even see it (p. 90). 


Like Locke, Berkeley of course does advance an account of general and particular 
terms and their referents. Indeed, there are passages in which Locke seems to be 
inclining towards the account usually attributed to Berkeley. However, Pitcher 
does not explain why a false solution to the very different question of word- 
world links is preferable to no solution nor does he tell us what the True Solution 
is. Berkeley has shown the inadequacy of Lockean abstractionism as an account 
of such links. Moreover, like most Cartesians and Malebranchians, Berkeley 
believes there are good reasons for rejecting abstractionism either as an account 
of word-world links or as an account of concept acquisition. That is why this 
tradition was driven to (some form of) occasionalism. Pitcher’s lamentation 
derives from another problem—his own ignorance of Berkeley’s philosophical 
sources. 

Similarly, Berkeley is said to ‘overlook a crucial distinction’ (p. 113) in the 
course of the attack on matter. Berkeley is guilty of ‘confusing an idea and what 
an idea is of’. This is the same so-called simple mistake Russell also accused 
Berkeley of making. Like many commentators before him, Pitcher must rest 
his case against ‘idealism’ on some clearer and more basic principle in terms of 
which the confusion is evident. So far as I can see, he merely follows the standard 
course of begging the question against Berkeley. Even G. E. Moore, who based 
his own famous refutation of idealism on this ‘crucial distinction’, decided in 
later years (for what were essentially Berkeleian reasons) that it could not be so 
readily drawn (see Moore [1942], p. 658). 

Pitcher’s version of Berkeley’s ‘Attack against Lockean Matter’ is also vitiated 
by an unexamined assumption. Pitcher assumes that Locke is ‘the Enemy’ 
(p. 91). This may be taken to be part of common-sense understanding for many 
Anglo-American philosophers, but it is not a self-evident truth. I should have 
hoped that Luce’s work from the 1930s would have created doubts. His 1944 
publication of the Philosophical Commentaries provides ample evidence that 
Berkeley is not simply refuting a scepticism he diagnosed as implicit in Locke. 
Popkin’s ‘Berkeley and Pyrrhonism’ ([1951-2]) indicates how indebted 
Berkeley is to Pierre Bayle for his formulation of the nature of scepticism. The 
esse is percipi thesis is formulated explicitly as a device to foreclose on that very 
dichotomy from which Berkeley claims scepticism and the search for a criterion 
for distinguishing appearance from reality arise. Moreover, it is puzzling that 
it should go unnoticed that Berkeley’s formulation of the primary/secondary 
quality distinction (in the Principles and the Three Dialogues) does not follow 
Locke. Not appreciating Berkeley’s concern with Pyrrhonism, Pitcher misses 
the force of the argument in Principles §10 on the inseparability of colour and 
extension. There are suggestions that Pitcher has relied too heavily on Jonathan 
Bennett who, after Locke and Hume, is the most frequently cited philosopher 
(followed by Wittgenstein). Fortunately, a corrective to Bennett is available. 
See David Berman’s [1972]. 

There are even problems with the assertion that Locke is a ‘metaphysical 
dualist’. A philosopher who allows that matter may think subscribes to a weaker 
dualism than that characterised by Pitcher. Nevertheless, Pitcher finds Berkeley’s 
appeal (Philosophical Commentaries entry 293a)¹ to God’s powers ‘nothing but 
a version of the hated Lockean dualism’ (p. 169). Even if Locke is read as a 
strict dualist, when Berkeley considers ‘Bodies’ as divine powers he is not 
categorising them as inert matter.² In any case, while he rejects mind/matter 
dualism, Berkeley does accept archetypal entities. This requires an under- 
standing of Malebranche and the possibility of a dualism which does not contain 
material substance. The issues are not narrowly Lockean. The suggestion that 
Berkeley is best read as an advocate of a ‘Conception Theory’ (p. 175 f.) is 
interesting and ought to be developed. That it is not may rest on Pitcher’s 
refusal to consider Berkeley’s debt to Malebranche. 

A problem also arises in the discussion of mental acts. Berkeley is the inheritor 
of a complex scholastic tradition on this subject. The task of making sense of, 
say, Suarez, is formidable. It requires sorting out several ontologically distinct 
relations. I sympathise with Pitcher’s restraint—but talk about the ‘relation 
between Bill’s body, and say, a tree, when Bill kicks the tree’ (p. 191), is really 
of no help in elucidating these questions. 

In turning to the history of philosophy, some philosophers in the so-called 
analytic tradition take as their task the sorting out of arguments in total isolation 
from and ignorance of the problems which gave rise to them. One looks at a 
printed text which the title page suggests is by an author with a name known 
from university philosophy courses, one knows English, and so one can say 
what the text means. One also decides whether the arguments are sensible, 
sound, and valid. One does this without worrying whether one possesses an 
established text (a task for bibliographers), without appeals to the historical 
context (a task for historians), and without reference to secondary scholarly 
sources (a task for philologists). Pitcher is not committed to quite so purist an 
interpretive thesis. His reading of Berkeley includes references to Locke (and 
to a number of modern commentators). He seems to have read the Berkeleian 
corpus. But it is hard to fathom Pitcher’s remarkable selectivity. It cannot all be 
attributed to his use of an inadequate (he describes it as ‘superb’ (p. 274)) 
bibliography and his omission of T. E. Jessop’s [1973]. 

The last chapter is devoted to an account of Berkeley’s 1712 tract, Passive 
Obedience, and is so entitled. No effort is made to explain, e.g., (a) the general 
political significance of the subject in the period following the accession of 
William and Mary; (b) why a Protestant Irishman of English extraction might 
be writing on the subject; (c) Berkeley’s attitudes towards the Jacobites; or 


1 See the new edition of Berkeley's Philosophical Commentaries, transcribed and edited 
by George H. Thomas (Alliance, Ohio: Mount Union College, 1976). 

2 See p. 216 of volume 2 of A. A. Luce and T. E. Jessop (eds.): The Works of George 
Berkeley, Bishop of Cloyne. London: Nelson, 1948-57. 

(d) whether Berkeley is in radical disagreement with Locke on the right of 
rebellion. Berkeley holds that ‘an absolute unlimited non-resistance or passive 
obedience [is] due to the supreme civil power’, (Passive Obedience, §2) and 
that our duty to obey negative moral precepts is absolute. I agree that Berkeley’s 
account of negative moral precepts is puzzling although I am not convinced that 
‘ “Do not steal” . . . is equivalent to the positive “Let other people keep their 
possessions” ’ (p. 230). (The example is Pitcher’s; Berkeley selects: ‘ “Thou 
shalt not commit adultery ...”’ (Passive Obedience, §3).) 

Berkeley maintains that if our consciences require us passively to disobey 
the sovereign, we must suffer whatever the consequences may be—even death. 
I agree that this is a hard doctrine. I too would reject it. But I am not impressed 
by Pitcher’s grounding his counter-argument in his ‘moral intuition about the 
kind of situation that existed in Hitler’s Germany’ (p. 246). The relevance of 
such abstract appeals to Hitler needs to be established. Moral purity in relation 
to a past of which one was not a part can usually be attained without too much 
effort. In fact, the moral intuitions of most Americans were not clearly anti- 
Hitler in the thirties (nor have they been notably hostile towards contemporary 
American foreign policy—except on pragmatic grounds). Americans may feel 
that on occasion other people have a moral duty to overthrow their leaders, but 
as I understand it, they have traditionally denied that Americans could have an 
obligation to rebel. However wrong-headedly, Berkeley was trying to deal with 
several concrete and immediately relevant attempts to specify the conditions 
under which rebellion by his fellow citizens would be appropriate, whereas 
Pitcher has given us a totally abstract and historically emasculated interpretation 
of Passive Obedience. 

The problems of Passive Obedience were addressed some years ago by Joseph 
Tussman in his [1957]. He too found many difficulties with Berkeley’s doctrine. 
However, he read Berkeley more sympathetically with reference to those utili- 
tarian considerations Berkeley sought to discount and to the difficulties that 
await our efforts to formulate a right of rebellion. Tussman also noted that those 
who drafted the First Amendment to the U.S. Constitution had some sense for 
the force of negative moral precepts when they wrote: ‘Congress shall make no 
law ... abridging the freedom of speech . . .’ and in addition, that Americans 
have seen how effective those utilitarian considerations which Berkeley dreaded 
have been in eroding that and other rights. 

I believe that Pitcher chose to conclude his study of Berkeley with an examina- 
tion of Passive Obedience because his own moral views are so different from 
Berkeley’s. After mapping philosophical terrain which he believes no sensible 
person would ever care to traverse, but which for some inexplicable reason 
he thinks must be surveyed, Pitcher for the first time in the book displays some 
real concern with his material. With respect to obedience doctrines, people have 
been tempted by views similar to Berkeley's. Happily, he feels, these views can 
be shown to be as perverse and as contrary to common sense as the rest of 
Berkeley's philosophy. 

This claims to be a book on Berkeley. Nevertheless, Pitcher ignores the 
historical and philosophical context within which Berkeley wrote as well as most 
of the philosophers who influenced him. By including discussions of Locke, 


1 Passive Obedience appears in volume 6 of The Works of George Berkeley (see n. 2). 

Pitcher seems prepared to allow the relevance of some ‘external’ material to our 
understanding of Berkeley’s text. But it is certainly not self-evident that Locke 
is either the only or the most important philosophical influence upon Berkeley. 
One could try to make a case that the philosophical influence of, say, 
Malebranche and Bayle was minimal in the construction of Berkeley’s arguments. 
Pitcher has done nothing of the sort. He has totally ignored them. So long as 
philosophers persist in assuming without question that a significant number of 
common themes bind Locke, Berkeley and Hume, or in considering Wittgenstein 
their guide to the nature of seventeenth and eighteenth century sceptical argu- 
mentation, or more generally, in believing that one can discover and evaluate 
a philosopher’s arguments while ignoring the philosophical scene within which 
the philosopher worked, we must look forward to books of this sort. I am not 
making a plea for any one interpretation of Descartes or Locke or Berkeley. I am 
not making a plea for any one philosophical style or approach or method. The 
philosophers of the seventeenth and eighteenth centuries are a fascinating group. 
They need and deserve to be subject to reevaluation from a wide variety of points 
of view. The basic tools for these tasks are readily available in any university 
library. It is unfortunate that Pitcher chose not to use them. 


HARRY M. BRACKEN 
McGill University 


EHRENBERG, W. [1977]: Dice of the Gods. Dundee: G. C. Stevenson (Printers) 
Ltd. £2. Pp. 110. 


In this book, published posthumously, the distinguished physicist Werner 
Ehrenberg attempts to set the conceptual innovations of modern quantum 
physics in an historical context, tracing the development of the concepts of 
causality, necessity and chance from their beginnings in the fifth century B.C. 
to their treatment by twentieth century physicists and philosophers of science. 
The story is told as simply as possible. No thinker is despised for having 
‘got it all wrong’. On the contrary, each thinker’s positive contribution is 
patiently elicited. The project is ambitious and unfortunately Professor 
Ehrenberg did not live to revise and amplify the work. Consequently, some of 
his summaries of thinkers’ approaches oversimplify or even distort. Thus the 
final paragraph of page 96, which purports to give an account of Sir Karl 
Popper’s approach, gives the false impression that Popper (Popper?) considers 
metaphysical beliefs to be ‘futile’ in science and disproof of scientific theories 
to be ‘feasible’. There is no mention of Popper’s attempt to cater for objective 
chance or of his approach to quantum physics. 

The main theses for which Ehrenberg argues are in outline as follows. 
Indeterminacy is a fact about nature rather than a fact about the interaction of 
quantum phenomena and observers. One can therefore accept quantum physics 
without giving up the belief in ‘the complete detachment of the subject of research 
from its object’. Chance and causality are both objective. Causality, however, 
is to be distinguished both from predictability and from necessity. Predictability 
is concerned with the capacities of the observer; indeed it is argued that the 
complete unpredictability of molecular and sub-molecular processes is a con- 
sequence of determinism. Ehrenberg examines the ordinary-language and 
everyday concept of causality and rightly points out that we look for causes 
where there are deviations from the ordinary run of events. His analysis on 
pages 70 ff. connects interestingly with that in H. L. A. Hart and A. M. Honoré: 
Causation in the Law, Oxford University Press, 1959. Necessity attaches to the 
laws of nature whether they relate to causes or not. In Ehrenberg’s words (p. 74) 
‘when we have an effect that is an event which singles itself out like a picture in a 
frame, we shall find a cause. If affairs just run on, we can normally point to 
some laws of nature’. 

The notion of forked causality is interestingly deployed and Ehrenberg 
speculates that human choices ‘utilise the forking of the causality of elementary 
events’ (p. 90). It would be nice to have a really good argument to substantiate 
the bold claim that we can ‘bridge the gap between the atomic order and the 
world we live in.’ 

The book has many trailers for scenarios which have yet to be written. It is 
sad that Professor Ehrenberg did not live to write the full story. Copies of the 
book are available through the Physics Department, Birkbeck College, Univer- 
sity of London, Malet Street. 

I have noticed a number of misprints, the only serious one being ‘mobility’ 
for ‘probability’ on page 27. 

NEIL COOPER 
University of Dundee 


SLOMAN, A. [1978]: The Computer Revolution in Philosophy: Philosophy, Science 
and Models of Mind. Hassocks, Sussex: The Harvester Press. £12.50 
Cloth, £5.50 Paper. Pp. xvi+304. 


After reading this lively book I found myself wishing that I had left the job of 
reviewing it to a professional philosopher. I would love to have read his response 
to claims such as the following: ‘Within a few years, if there remain any philoso- 
phers who are not familiar with some of the main developments in artificial 
intelligence, it will be fair to accuse them of professional incompetence’ (p. 5). 
‘Progress in philosophy (and psychology) will now come from those who take 
seriously the attempt to design a person’ (p. 13). ‘By thinking about possible 
mechanisms underlying fairly common abilities we can reveal the poverty of 
most philosophical and psychological theories about the nature of mathematical 
concepts and knowledge’ (p. 183). Nobody could accuse Dr Sloman of undue 
modesty, nor of toadying to his philosophical colleagues. A late convert (by his 
own account) to the game of computer programming, he is understandably eager 
to share with all comers the insights that the new (and painful) discipline has 
brought him. 

Unfortunately, as he himself admits on page 270, the title of his book is 
somewhat misleading (the subtitle is more accurate). Much of it, especially in 
part I, has little or nothing to do with computers. Forty pages are devoted to 
‘The Aims of Science’, arguing that science is concerned with establishing and 
explaining possibilities rather than laws. A further thirty-odd discuss the rela- 
tionship that Sloman sees between science and philosophy, and the nature of 
conceptual analysis. Not until page 103 do we have the main issue addressed in a 
short chapter entitled ‘Are Computers Really Relevant?’. The tone is promissory; 
but I am not sure whether the second Part will satisfy those readers ‘made very 
uncomfortable, if not positively antagonistic, by [the author’s] remarks about the 
role of computing and computer programs in philosophy’ (p. 103). Part II 
gives a lucid ‘Sketch of an Intelligent Mechanism’, outlines some computational 
approaches to ‘Intuition and Analogical Reasoning’, ‘Learning about Numbers’ 
and ‘Perception’, and concludes with a wide-ranging thirty-page survey entitled 
‘AI and Philosophical Problems’. Despite the tone of his earlier claims, the 
author is frank in admitting the poverty of actual achievements in these areas. 
In relation to perception, for example, ‘our present ignorance is not a matter 
of our not knowing which theory is correct, but of our not even knowing how to 
formulate theories sufficiently rich in explanatory power to be worth testing 
experimentally’ (p. 226). 

It would be a pity, however, if this book were to “turn off” philosophers 
curious to know whether they are missing something by ignoring ‘AI’. The 
central problem, which is certainly of philosophical interest, is that of the 
organization of intelligent action. This leads directly to the question—also 
philosophically deep—how far it is possible to specify explicitly procedures and 
constraints sufficient to ensure that their product will be an action of a required 
type. One way of testing the adequacy of a procedural account of a human 
performance is to simulate it on a computer and see whether the product has 
the required characteristics. Hence most practitioners of AI tend also to be 
computer programmers. 

The question inadequately answered by Sloman’s book, as I see it, is how 
far along this chain of activities it is necessary to draw all professional philoso- 
phers who do not wish to be dubbed “incompetent”. I happen to agree with 
him that many of the qualitative notions of automation engineering and 
information-flow analysis (including such key concepts as evaluation, which I 
think he underemphasises) provide such an essential conceptual repertoire for 
the understanding of human action that a philosopher ignorant of them would 
be needlessly handicapped. I think he is right too in suggesting (p. 200) that 
if Ryle had been more familiar with the information-engineering notion of 
parallel and interacting procedures he might have had less reason to reject that 
of multiple inner mental processes. All this, however, is far from showing that 
what philosophers need to master is the technology of AI at the computer 
programming level. Although no one could guess it from the total absence of 
references to such pioneers as Craik, Wiener and McCulloch, and indeed to 
practically all the foundational workers before 1960, the fact is that the basic 
requirements for the organisation of intelligent action were hammered out in the 
40s and 50s without any recourse to the kind of digital computer modelling 
described by Sloman, and can be perfectly well absorbed by people with no 
specialist knowledge of computer programming. To turn these basic ideas into 
testable digital programs has undoubtedly been a major intellectual under- 
taking with its own intrinsic interest; and as Sloman shows, new insights into 
cognitive mechanisms are continually emerging in the process. But unless the 
philosopher in search of illumination intends to specialise in the field, to lure 
him into the dreary wastes of the AI programming literature would seem as 
heartlessly inept as to recommend burning down his house to roast a pig. 
Sloman’s book deserves to be read and enjoyed as a sampler of an area of human 
self-analysis that is engaging some of the best brains of our time—not as a 
pointer to the ideal career for a modern philosopher. 
D. M. MACKAY 
University of Keele 


HOLDCROFT, D. (ed.) [1978]: Papers in Logic and Language. Coventry: University 
of Warwick. £1.50. Pp. 177. 


This volume consists of a collection of papers first delivered at a most enjoyable 
conference at the University of Warwick in 1976. The papers deal with a variety 
of topics in the philosophy of logic—from rigid designators to future contingents. 
I shall mention them all, the space devoted to each reflecting my interest rather 
than their merit. 

Two of the papers, Hookway’s and Parret’s, deal with what is communicated 
by an utterance as opposed to the truth value of that utterance. Hookway is 
concerned with the structure of communication by means of language seeing 
the use of language by a speaker as a means of conveying what the speaker 
desires the hearer to believe that there is a reason to believe. Thus, to use 
Hookway’s example, uttering the sentence ‘the temperature is 2°C’ the speaker 
may mean to convey to the hearer that the ice is too thin for skating. What in 
fact an utterance conveys will depend on context, speaker’s and hearer’s know- 
ledge etc. No set of rules can be expected to emerge from such an investigation 
as the above example indicates. 

The connection between Hookway’s paper and Grice’s notion of conversational 
and conventional implicatures will be obvious. Parret investigating these notions 
attempts to distinguish between conventional and conversational implicatures 
and to give as far as possible a classification and systematisation of them using 
a theoretical background of pragmatic deduction. How successful this is I leave 
to those who are more familiar with its subject matter. 

Read’s paper uses Donnellan’s distinction between the referential and 
attributive uses of definite descriptions in his discussion of referential opacity. 
In the first part of the paper he argues that Quine is wrong in maintaining that 
scope differences could remove ambiguities over referential opacity, for the use 
of an existential quantifier occurring either as a main connective or embedded 
in a larger (possibly intentional) context implies that no reference in Donnellan’s 
sense is being made at all. 

Lewis contributes an interesting paper on exceptionable generalisations 
‘dogs have tails’, ‘lying is wrong’ etc. His main concern is to provide an account 
of the truth conditions of such sentences. After investigating and rightly rejecting 
several interpretations he concludes that it is only by an appeal to our background 
knowledge and understanding of each individual exceptionable generalisation 
that we can determine the truth value of that generalisation. Although I think 
this solution is on the right lines in rejecting any uniform account of such 
generalisations it does overlook the fact that the generalisations do not occur 
in a vacuum. They are presumably uttered by someone who wishes to convey 
something. If a speaker says something of the form ‘Fs are Gs’ we are entitled 
to ask what he means before agreeing or disagreeing with the generalisation. 
Sometimes a universal ‘all Fs are Gs’ will be intended, sometimes ‘most Fs are 
Gs’ and sometimes some theory-laden statement. Take, for example, ‘dogs 
have tails’. Someone uttering this sentence could use it as a genuine universal 
exceptionless statement as in ‘dogs are animals’ but much more likely as a state- 
ment of frequency ‘most dogs have tails’. If the utterer has some knowledge of 
species and genetics he may mean that it is genetically determined that dogs 
have tails (which statement could presumably be expanded into a frequency 
statement about genes). The point is that we cannot settle the truth value of 
‘Fs are Gs’ even when we know that such a statement is intended as exception- 
able without asking for further information. In practice this is exactly what we 
do: if someone says ‘women do not play chess as well as men’, don’t we want 
to know whether that person intended the statement as an observation about the 
frequency of female Grand Masters or as the result of genetic inevitability? 

Jeffrey’s paper concerns future contingents, taking seriously the idea that a 
proposition may have a third truth value—‘undetermined’. Such a course is not 
new but what is new is the introduction of two operators ‘I’ and ‘E’. ‘I’ we are 
told is to mean ‘ineluctable’ rather than ‘necessary’, so that we may say that 
every proposition about the past (and the present?) is ineluctable whereas it is 
possibly the case that not all propositions about the future are ineluctable, 
though some may be. A second operator ‘E’ is to mean ‘eventually becomes true’ 
so that a proposition about the future though not now true will eventually 
become true or false. Thus in place of the law of excluded middle we have 
Ep ∨ ¬Ep. A model-theoretic semantics is then given for a formal language 
containing propositional variables plus the usual logical constants of the proposi- 
tional calculus. How we are to judge the worth of this enquiry depends on how 
seriously we take the claim that future contingents are not now true or false 
but have a third truth value and also how well we can interpret Jeffrey’s ‘ineluct- 
able’ informally, for no matter how well worked out the model-theory we shall 
eventually have to match it against some informal notion. As a result of the truth 
tables chosen by Jeffrey, not only is ‘p ∨ ¬p’ not valid, but none of the usual 
tautologies are valid. Informally this must surely be incorrect for it is not the case 
that ‘if it is warm and sunny tomorrow then it will be warm tomorrow’ will 
eventually become true for it is simply now true. 

Bell’s paper is a further contribution to the debate started by Davidson’s 
‘Truth and Meaning’ paper. It contains arguments both against Davidson and 
against Dummett’s recent criticisms. The arguments presented here seem as 
valid as this vague talk of ‘theories of meaning’ and ‘theories of understanding’ 
allows: these topics together with the jargon they have bred, are in need of 
radical deflation. 

Plantinga’s paper sets out to give an interpretation of possible world talk so 
that it does not imply that there are things that do not exist. An ingenious 
argument eventually reaches the conclusion that ‘although there could have been 
some things that do not in fact exist, there are no things that do not exist but 
could have’—a fitting conclusion to an argument which relies on the necessary 
existence of possible worlds, properties and essences. Plantinga clearly lives in 
a possible world of his own. 

The most interesting and challenging paper in this collection is undoubtedly 
Miller’s ‘What’s in a numeral?’ in which he argues with style and verve against 
Kripke’s distinction between rigid and non-rigid designators. A continuation 
of the argument provides reasons for adopting the radical thesis that numbers 
and other abstract objects can change. Of course there are those, Dummett for 
example, who define abstract objects as those that can neither cause nor be the 
subject of change; in which case there would be no abstract objects in that sense. 
The crux of Miller’s argument is the symmetry of identity statements. Given 
that the number of known planets is now 9 but was at some time 7, what reason 
have we for saying that the number of known planets has changed but that 7 
has not?; for surely we can say that since 7 was the number of known planets 
but is no longer, 7 has changed. Similarly, Kripke’s suggestion that the number 
of planets could have been other than what it is while the square root of 25 
could not be other than 5 is dismissed for we may say equally that the number 
of planets could not have been other than the number of planets while the square 
root of 25 could have been the number of planets. Attempts to deal with these 
arguments by scope differences Miller refutes quickly and Quinely. Even if 
Miller’s argument that abstract objects can change can be refuted, as I believe 
it can, it must provoke the defenders of rigid designators into giving a more 
thorough elucidation of the notion of rigid designators than it has had so far. 
If only the notion had received as much elucidation as it has had application! 
This paper deserves to have a wide readership: I hope, therefore, that the modest 
price will encourage its circulation. 

A. J. DALE 
University of Hull 


BARWISE, JON (ed.) [1978]: Handbook of Mathematical Logic. Amsterdam: 
North-Holland Publishing Co. D.fl 190.00, $75.00. Pp. xi+1165. 


This compendious volume consists of thirty-one articles which are collectively 
designed to provide an overview of the present state of mathematical logic. As 
Barwise indicates in his Foreword to the book, it is primarily intended for mathe- 
maticians but several of the articles—especially the introductory ones at the 
beginning of each part—could be read with profit by philosophers with some 
knowledge of mathematical logic. 

The book is split into four parts, corresponding to the traditional quadri- 
partite division of mathematical logic into Model Theory, Set Theory, Recursion 
Theory, and Constructive Mathematics (including Proof Theory). Each part 
begins with an article pitched at a fairly elementary level intended to acquaint 
the reader with the basic essentials of the area in question. This is followed by a 
series of articles discussing, often in considerable depth, various important 
aspects of the subject. 


Part A. Model Theory 


This begins with Barwise’s lucid introductory article on First-Order Logic. 
Next comes Keisler’s account of the Fundamentals of Model Theory, which in 
the space of fifty-odd pages gives a systematic overview of many of the most 
important facets of the subject, including some of the more recent developments, 
for example, recursively saturated models, stable theories and model-theoretic 
forcing. This is followed by Eklof on Ultraproducts, Macintyre on Model 
Completeness and Morley on Homogeneous Sets. Next is Stroyan’s somewhat 
off-beat account (entitled ‘Elements of Infinitesimal Calculus’) of some of the 
basic notions of Nonstandard Analysis. He focuses attention on the application 
of Nonstandard Analysis to differential geometry, providing among other things 
an interesting reformulation, in non-standard terms, of the leading ideas in 
Gauss’s famous 1827 work, ‘General Investigations of Curved Surfaces’. In 
addition, he gives a brief but informative account of the tangled history of 
infinitesimals and how Nonstandard Analysis finally enabled the notion to be 
put on a sound basis. This is the only article on Nonstandard Analysis in the 
book, and I feel that perhaps it would have been more appropriate, if the article 
is intended as an expository account of its subject, to have described less complex 
applications (e.g. to general topology or topological groups) than the one given 
here. The section continues with Makkai’s article on Infinitary Logic, which 
among other things provides a useful account of admissible sets and the Barwise 
Compactness Theorem. Finally, we have Kock and Reyes’ article on Categorical 
Logic, in which the authors furnish a compressed account of the category- 
theoretic approach to model theory. This article presupposes a considerable 
knowledge of category theory and universal algebra, but is nonetheless a helpful 
survey of an area that has developed with astonishing rapidity over the past few 
years. 


Part B. Set Theory 


This opens with Shoenfield’s expository article on Axioms of Set Theory. In 
essence, this is an expanded version of the remarks the author made at the 
beginning of the chapter on Set Theory in his well-known book (Mathematical 
Logic, Addison-Wesley, 1967), and attempts to justify the axioms of Zermelo- 
Fraenkel set theory on the grounds that they hold for the notion of ‘set as member 
of the cumulative hierarchy’, or, equivalently, ‘set constructed by (logical) 
stages’. Of course, this is not the only way of justifying the axioms of set theory, 
and it is at least debatable whether the notion of set discussed by Shoenfield 
was what Cantor, for example, actually had in mind. But Shoenfield’s account of 
how the axioms of set theory can be derived from his notion of set is really very 
lucid and convincing. Next comes Jech’s article on the Axiom of Choice; this 
is pitched at a fairly elementary level and includes a sketch of the ideas behind 
the consistency and independence proofs for the axiom. This is followed by 
Kunen’s account of infinitary combinatorics and large cardinals and Burgess’ 
article on Forcing. The latter is a very compressed and somewhat indigestible 
account and I feel that the author has attempted to cram rather too much into it. 
Also there is no mention whatsoever of the Scott-Solovay approach to forcing 
via Boolean-valued models. Next comes Devlin’s article on Constructibility: 
in addition to a proof of Gödel’s classic result that the Axiom of Choice and the 
Generalised Continuum Hypothesis are relatively consistent, it includes an 
account of some recent work of Jensen and others on the structure of the con- 
structible universe. One rather strange fact is that the author has omitted to 
make any reference to the important notion of relative constructibility. Also I 
think it would have been helpful to compare the constructible universe with 
Russell’s ramified hierarchy of types, which latter, according to Gödel himself, 
provided the original motivation for his creation of the notion of constructible 
set. The section concludes with Rudin’s discussion of Martin’s Axiom, a technical 
hypothesis concerning ordered sets which is of considerable use in establishing 
independence results in set theory, and Juhasz’s article on Consistency Results 
in Topology. 


Part C. Recursion Theory 


This begins with Enderton’s excellent expository account of the basic ideas of 
the subject. Next comes Davis’ article on Unsolvable Problems which includes 
a presentation of the (negative) solution to Hilbert’s Tenth Problem. This is 
followed by Rabin’s account of Decidable Theories, which, among other things, 
contains a number of interesting results on the complexity of solvable decision 
problems. Then we have Simpson on Degrees of Unsolvability, Shore on 
α-Recursion Theory (i.e. recursion theory on admissible ordinals), Kechris and 
Moschovakis on Recursion in Higher Types, and Aczel on Inductive Definitions. 
The section concludes with Martin’s account of applications of logic to descrip- 
tive set theory. 


Part D. Proof Theory and Constructive Mathematics 


First comes Smorynski’s lucid account of the Incompleteness Theorems, 
followed by Schwichtenberg’s article on the role of Gentzen’s method of cut- 
elimination in proof theory. Next we have Statman’s treatment of the notion 
of a direct proof, i.e. one which contains no term more complex than all those 
occurring in the theorem proved. This is followed by Feferman’s excellent 
account of Theories of Finite Type, which are restricted versions of set theory 
approximating more closely to actual mathematical practice. After this comes 
Troelstra’s discursive article on Constructive Mathematics: this includes an 
explication of the intuitionistic logical operations and a brief development of the 
theory of real numbers along constructive lines. This is followed by Fourman’s 
chapter on the Logic of Topoi, which contains a straightforward introduction 
to the theory of toposes and a presentation of the author’s work on toposes as 
models of higher order intuitionistic logic. (Roughly speaking, a topos is a 
category in which one can interpret higher order terms.) Next, we have 
Barendregt’s article on the lambda calculus, in which the primary object of 
study is the applicative behaviour of functions (and not, as in category theory, 
just their behaviour under composition). The volume concludes with something 
of a scoop: the first published account of Paris, Harrington and Kirby’s im- 
portant joint work on mathematical incompleteness in first-order arithmetic. 
Here a true arithmetical statement S of mathematical practice (i.e. one not depen- 
dent for its meaning on numerical coding of notions from logic) is found which 
is independent of first-order arithmetic. Moreover, this statement—a slightly 
amplified version of the well-known Finite Ramsey Theorem—is mathematically 
simple and interesting. The proof that S is independent of the axioms of first- 
order arithmetic is difficult, and only a sketch is given. However, there is no 
doubt that the result is sufficiently important to merit inclusion in a survey 
volume of this sort. 


It will be seen from the description of the book’s contents that it offers a very 
wide-ranging survey of the present state of mathematical logic. The standard 
of exposition in the articles is, on the whole, high, and in some cases outstand- 
ingly so. Its price (and bulk) make it unlikely, in my opinion, to become a very 
widely circulated book, and I am uncertain as to the extent it will achieve its 
intended—and laudable—purpose of ‘sharing with the mathematical com- 
munity some modern developments in logic’ (Editor’s Foreword). What I am 
sure of is that it is an excellent reference volume for mathematical logicians, 
and one that makes for fascinating browsing. 


JOHN BELL 
London School of Economics 


THORPE, W. H. [1978]: Purpose in a World of Chance. Oxford University Press. 
£3.95. Pp. xii+123. 


Professor Thorpe is careful to pin responsibility for his new book onto his 
publishers (p. vi), and they tell us (on the jacket) that the idea is to offer ‘a short 
answer to Jacques Monod’s widely discussed work, Chance & Necessity’. 
Though Thorpe does not present it as such, that answer is to be found, I think, 
on page 117. What we must do, thinks Thorpe, is embrace Christian ideals, or 
else ‘terrible things may happen to mankind, far more terrible even than con- 
centration camps and atom bombs’ (the words are quoted from Heisenberg). 

Thorpe does not try to champion this claim as it stands, but his book is best 
construed, I’m sure, as a contribution to its defence. It is an attack on what 
Thorpe refers to variously as ‘reductionism’, ‘mechanism’, ‘materialism’, or 
‘mechanistic-materialism’, and these are worth attacking, he clearly thinks, 
because of the threat they pose to ‘Christian ideals’. 

The first obstacle to taking Thorpe’s campaign seriously is that nowhere do 
we get a clear enough view of the enemy. What, for example, does Thorpe 
understand by ‘mechanism’? 

The argument of his first chapter is that Monod is silly to defend a ‘mechanis- 
tic’ construal of biology because even physics is now seen as non-mechanistic 
(p. 9). So, one must concede, it is—in a certain sense of ‘non-mechanistic’. But 
almost certainly Thorpe has more than this in mind, for he seems happy to 
gloss ‘non-mechanistic’ as ‘non-physical’ and to gloss ‘non-physical’ as ‘mental’, 
‘purposive’ or ‘spiritual’ (p. 115). And while it may be true that modern physics 
is in one sense non-mechanistic, it at least has to be argued for that it is com- 
mitted to the non-mechanistic construed as the ‘mental’ or the ‘purposive’ or 
the ‘spiritual’. 

The same lack of clarity about ‘mechanism’ vitiates another of Thorpe’s 
arguments. Having given an account of some of the more sophisticated of 
animal capacities—approaches to linguistic behaviour and the capacity to count, 
in particular—Thorpe invites us to admit that ‘we find ourselves increasingly 
convinced that, astounding though the mechanisms studied and revealed by 
anatomists and physiologists are, something more than what we call “mechanism” 
is required to account for some at least of these overwhelming performances of 
animals and their evolutionary development’ (p. 75). 

The only argument I can discern in what Thorpe says as supporting this 
intuition is that the development of the capacities discussed obliges us to accept 
that ‘behaviour is always one jump ahead of structure; and that habits, traditions, 
and behavioural inventions must have played an ever increasing role in the 
evolutionary story as the animals mounted the ladder of complexity’ (p. 75). 
(A view which Monod accepts, incidentally: ‘in man more than in any other 
animal... it is behaviour that orients selective pressure’.1) An explanation of 
why this conclusion is thought incompatible with ‘what we call “mechanism” ’ is 
not provided, nor any further specification of what ‘what we call “mechanism” ’ is. 

A second obstacle to taking Thorpe’s discussion seriously is the hazy grasp 
it betrays of the positions to which it attempts to relate. An important instance 
is his attribution to the mind-brain identity hypothesis of the tenet that a person 
is conscious of all the neural activity that goes on in the cerebral cortex (p. 92). 
Elsewhere, he wisely adds the restriction in brackets that this tenet belongs to 
the identity hypothesis ‘at least in its earlier and rather naive forms’—and even 
more judiciously omits to make any references to such naive formulations (p. 
87). He speaks also of the identity theory as postulating that ‘conscious ex- 
periences run parallel with brain events’ (even apparently referring to it as 
‘parallelism’) and says that it denies that ‘consciousness can exert any effective 
action on neural happenings’ (p. 92). It would be remarkable if insight managed 
to triumph over these misunderstandings and I don’t think it does. 

A third obstacle is Thorpe’s weakness for leaving out the heart of an argument. 
He expounds Whitehead’s system in support of his contention for the importance 
of ‘purpose’ in the Universe, only to deal with the crux like this: 


For him [Whitehead] Reality is Process. An entity, as it is to itself, is a 
subject; but as it appears to us is an object. From this, Whitehead leads 
on to one of the most fundamental elements of his beliefs, namely that 
subjective aim is postulated because, without it, no entity could even exist 
(p. 107). 

1 Jacques Monod: Chance and Necessity, p. 152. 

For this interesting, and for Thorpe central, thesis, no argument from 
Whitehead is reported; we are simply assured that he ‘leads on’ to it. 

There are plenty of lesser obstacles. A good deal of the value of Monod’s book 
lies in its excellence in rendering scientific findings accessible to the lay person. 
Thorpe, on the other hand, succeeds only in irritating when he tries to join in. 
His attempt to explain the Krebs cycle from scratch in the space of half a page 
is commendable only for its ambition (p. 8, 9), and both in this example and his 
even briefer account of alarm substances in ants he appears to assume that 
enormously complicated diagrams are just the thing to clarify a murky exposition. 

Thorpe is also prepared to send his reader in pursuit of a previous book of 
his to find details of books referred to importantly in this one. In several places 
(e.g. pp. 82, 100) he refers the reader to earlier on in the present book when as 
far as I can see he must have a previous one in mind. 

But what when all the obstacles have been circumvented? What of Thorpe’s 
argument when one does arrive at the point of taking it seriously? 

Let me say what I think it is: purposes, argues Thorpe, play a role in our 
world in a number of ways. Their role is obvious in human beings’ behaviour, 
but it is also there in the behaviour of quite lowly creatures like ants, grubs and 
wasps (p. 38). Purpose or purposes also play a part in determining the course 
of evolution, in virtue of the fact that creatures’ behaviour, including purposive 
behaviour, constitutes part of the framework within which natural selection 
operates. To the extent that what Thorpe reports from Whitehead can be 
accepted (he does not make this extent clear), purposes also play a role in sustain- 
ing in existence not only organisms but also entities in general. And finally 
Thorpe hints that purposes must also be invoked to account for the origin of 
life, or of ‘self-conscious mind’, on the grounds that all other explanations fail. 
Now, in attributing a purpose to a thing, Thorpe says he is attributing to it an 
‘intention to attain some end’, and an ‘intention’ he thinks of as necessarily 
something of which the intender at least might be conscious. 

The claim he is apparently seeking to establish therefore is the all-pervasiveness 
of the capacity for consciousness, a capacity to be found not just in human 
beings, but also in lower animals, plants, the simplest organisms and even in 
lumps of inanimate matter. And his idea seems to be that establishing this would 
refute ‘materialism’ (‘mechanism’, ‘reductionism’ or whatever). 

Against this line of argument I would suggest first that while the phenomenon 
of consciousness does indeed present a difficulty for the type of reductionism 
represented say by Monod, its being all-pervasive would not compound the 
difficulty. The problem is that of understanding how a whole that is capable of 
consciousness may be made up of parts which severally lack that capacity. But 
once you are satisfied that such an understanding can be reached, the difficulty 
has been surmounted, and any ‘discovery’ that consciousness is more widespread 
than might have been thought will be beside the point. 

But secondly, Thorpe’s argument for the conclusion that consciousness is 
all-pervasive is broken-backed. On the one hand he argues that empirical work 
shows that we must think of ants, grubs, wasps and birds as activated by true 
purpose, and on the other expounds—and one must assume commends—the 
view (attributed to Whitehead) that everything must be so activated as a condi- 
tion of its very existence. But it is surely useless to argue that such and such a 
special sort of behaviour evidences purpose, if you go on to claim that behaviour 
of all other kinds is activated by purpose too. 

But there are more interesting features of Thorpe’s book than the arguments 
to be discovered there. 

There are unconsidered assumptions, interesting for their vintage. For 
example he clearly thinks (in 1978) that the highest praise that can be showered 
upon the indigenous religions of Africa and America is to observe that they 
contain ‘germs of beliefs having close similarity to some of the major tenets of 
Christianity’: praise which Thorpe thinks himself forward enough thinker to 
confer without reservation. And a page earlier we have the declaration that many 
scientists are kept going only by their belief in perfect knowledge, beauty and 
goodness. The development over the last few decades of important and ramifying 
critiques of both these assumptions has clearly passed Thorpe by. 

Then there is the way in which Thorpe allows the citing of authorities to 
substitute for argument. The mind-brain identity theory, says Thorpe at one 
point, has ‘suffered as a result of the remarkable investigations . . . by Sperry 
and his associates’. But instead of saying how (as Monod would certainly have 
done!) we are simply told, in the baldest terms, what overall view Sperry happens 
to maintain, which happens to be Thorpe’s. (And Thorpe incidentally does not 
even see the appropriateness of giving a reference to Sperry in the bibliography.) 
We find exactly the same pattern in the discussion of the problem, as Thorpe 
identifies it, of how ‘the will’ sets in train brain events that issue in action. ‘There 
have been some challenging suggestions’, he tells us, involving ‘the postulated 
action of the will on a few neurons or even on a single neuron’. But ‘this is too 
technical to warrant our following it through here’ (p. 94). Instead, we are 
favoured with another ‘expert’ opinion. (Is it perhaps somebody else’s ‘authorita- 
tive opinion’ he is relying on when he asserts the view that Whitehead’s and 
Russell’s Principia Mathematica is ‘one of the great intellectual monuments of 
all time’ (p. 104)—the one that told him that what this work attempted was ‘to 
establish mathematics as the basis for a reformed formal logic’?) 

Thorpe’s technique of invoking authority when argument for one reason or 
another fails him is of course not idiosyncratic. It is entirely characteristic of a 
certain genre within Christian apologetics to which his book, as I have already 
suggested, belongs. It is the mode of discussion that scientists generally (there 
are outstanding exceptions of course) think appropriate when matters of religion 
are broached. Perhaps one should conclude that their recourse to this mode 
merely reflects their conception of the character of philosophical thought. But 
there is a second more interesting possibility which should not be neglected: 
that what we see displayed in books like Thorpe’s is not the scientist’s view of 
how to think philosophically, but scientific thinking itself. 


VERNON PRATT 
University of Lancaster 
HISTORY AND PHILOSOPHY OF LOGIC


This new publication will publish articles, notes and book reviews dealing with 
the history and philosophy of logic. It will appear as a volume, hopefully annually. 
By ‘logic’ is understood any body of knowledge which was regarded as logic 
at the time in question. ‘History’ refers back to ancient times and also to work
in this century; however, it will not publish articles, including review articles, 
on very recent work on a topic. ‘Philosophy’ refers to broad and general questions; 
it will not publish specialist articles which are now classed as ‘philosophical 
logic’. Articles on the relationship between logic and other branches of know- 
ledge will be considered, but the component of logic must be substantial. 

Articles on relevant collections of manuscripts and on (large-scale) projects 
about to start or in progress, and short notes on problems and issues in the field 
(such as requests for information, points about terminology, and so on) will also 
be published. It will also publish book reviews; these will be solicited by the 
editor. It will not publish articles on meetings, on teaching logic, or job and 
meeting advertisements. 

The editorial board has been designed to cover the main areas in the history 
and philosophy of logic. It comprises J. Ashworth, J. Berg, R. S. Y. Chi, 
J. Corcoran, G. Gabriel, I. Grattan-Guinness, S. Haack, A. Kratzer, C. Lejewski, 
J. Pinborg, M. Resnik, B. Smith and C. Thiel. The editor is I. Grattan-Guinness, 
34 Hillside Gardens, Barnet, Herts., EN5 2NJ, England, to whom manuscripts 
can be sent. It is hoped to publish the first volume of History and Philosophy 
of Logic during the first half of 1980. 


MACHETTE FOUNDATION LECTURES 


The Machette Foundation of New York has made a grant to enable public 
lectures to be given by visiting American philosophers in Universities and 
Polytechnics in Great Britain and Ireland. The grant is administered by a 
committee whose chairman is Dr W. Mays, Department of Philosophy, University 
of Manchester, Manchester M13 9PL, and University and Polytechnic Depart- 
ments interested in taking part in this scheme should communicate with him. 


SEVENTH BIENNIAL MEETING OF THE PHILOSOPHY 
OF SCIENCE ASSOCIATION 


Tentative Dates: 16-19 October 1980 

Place: Toronto, Canada 

The Philosophy of Science Association will hold its Seventh Biennial Meeting 
in Toronto, Canada. The meeting is tentatively scheduled for the third weekend 
in October. The program will include sessions of contributed papers as well as 
symposia and other special sessions. Contributed papers will be preprinted as 
Volume 1 of PSA 1980. The symposia and other papers will be published later 
as Volume 2.

Contributed papers may be on any topic in the philosophy of science and
written from any philosophical standpoint. Maximum length is 3500 words. 
Two copies, each including a 100 word abstract, should be prepared in double 
spaced typescript. Since blind refereeing will be used, the author’s name and 
institution should appear on a separate cover page. These materials should be 
sent to the Chairperson of the Program Committee: Professor Ronald N. Giere, 
Department of History and Philosophy of Science, 130 Goodbody Hall, Indiana 
University, Bloomington, Indiana 47401 U.S.A. The closing date for submission 
is 15 January 1980. 

PSA is primarily interested in papers that are ready for publication. However, 
papers involving work in progress which the author wishes to present at the 
meeting but not necessarily to publish in the proceedings will be considered. 
When submitting such a paper, the author should indicate that it is being 
submitted for presentation only. 

Authors of accepted papers which are to be published will be responsible for
preparing a camera-ready typescript. Instructions, which pertain to both style 
and format, will be mailed with the letter of acceptance. Since it would speed 
preparation of the final copy, potential contributors are encouraged to prepare 
their original manuscripts in accordance with these instructions. Copies may be 
obtained from the Program Chairperson or from the PSA Business Office, 
18 Morrill Hall, Michigan State University, East Lansing, Michigan 48824, 
U.S.A. The deadline for camera-ready copy will be 1 May 1980. Authors of 
papers accepted for presentation only will be asked to submit a 500 word abstract 
by that date. 
Hidden Variables and the Propensity
Theory of Probability


by JIM EDWARDS 


Let $S_1$ be the sentence: ‘The probability of this coin landing heads is 1/2.’
Propensity theorists of probability hold that $S_1$ reports an objective
property of the world rather than the subjective partial knowledge and
partial ignorance of the speaker. One propensity theorist (Popper¹) holds
that $S_1$ reports a property of each single toss (performed in accordance
with certain defining conditions), rather than a property of a sequence
of tosses (each performed in accordance with the same defining
conditions). Another propensity theorist (Mellor²) holds that $S_1$ reports the
display in a single trial of a dispositional property of a physical object, 
the coin or perhaps the tossing device, rather than a property of a sequence 
of tosses using the physical object. Sklar in his [1970] has presented a 
challenge to propensity theories which I propose to meet in the case of 
coin tossing. It will emerge that it is a Popper-like propensity theory, and 
not a Mellor-like theory, which, in my opinion, survives Sklar’s challenge. 


1 Suppose we have a mechanical tossing device, a newly minted coin 
and a resilient surface upon which the coin is to land. Suppose also a 
set of operating instructions for carrying out a trial—such instructions 
to include procedures to exclude variations in temperature, draughts, 
distortion of the coin, electrostatic and magnetic forces, etc. Suppose we 
find by experiment that in a long run of tosses the relative frequency of
heads is close to 1/2, and the longer the run of tosses the closer the relative
frequency is to 1/2. Given these experimental results, we would (provisionally)
accept $S_1$ as a report of our findings.

Given our (provisionally) accepted theories of mechanics, of motion 
through a viscous medium, of the forces produced by partially elastic 
impact, and given the appropriate permanent properties of the tossing 
device, coin and landing surface, there is a set of parameters $I_1, \ldots, I_n$
which are more detailed descriptions of the act of tossing the coin, each 
describing a state of the system (or part of the system) prior to the coin 
coming to land heads or tails, and such that the following is the case: 


Received 31 August 1978


1 See especially Popper [1957], Popper [1959], Popper [1968], P. Schilpp [1974], pp.
1132-5.
2 Mellor [1971], chapter 4.
Let X be a single toss of the coin conforming to the operating
instructions. Let the (unknown) values of $I_1, \ldots, I_n$ in the trial X
be $I_1^X, \ldots, I_n^X$ and suppose that X resulted in heads. If $e_1, \ldots, e_n$
are sufficiently small numbers, then in a long sequence of trials
defined by the operating instructions plus the added conditions
that $|I_1^X - I_1| < e_1$, $|I_2^X - I_2| < e_2$, ..., $|I_n^X - I_n| < e_n$ the frequency of
heads will be close to 1, and the longer the sequence and the smaller
the numbers $e_1, \ldots, e_n$ the closer the frequency will be to 1.


Let us call this last sentence $S_2$.
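
The content of the sentence just stated can be illustrated with a toy calculation. The sketch below is not drawn from the theories cited in the text: it assumes a hypothetical two-parameter model in which the coin starts heads up, leaves the device with upward speed v and spin rate w, and lands heads exactly when it has completed an even number of half-turns. All names and numerical values are illustrative assumptions.

```python
import math
import random

G = 981.0  # gravitational acceleration in cm/s^2 (c.g.s. units)

def lands_heads(v, w):
    """Toy deterministic rule (an assumption, not the author's model):
    the coin starts heads up, flies for 2*v/G seconds while spinning at
    w rad/s, and shows heads if it completed an even number of half-turns."""
    half_turns = math.floor(w * (2.0 * v / G) / math.pi)
    return half_turns % 2 == 0

def heads_frequency_near(v_x, w_x, eps_v, eps_w, trials, rng):
    """Relative frequency of heads when every trial keeps its initial
    conditions within eps_v of v_x and within eps_w of w_x."""
    heads = 0
    for _ in range(trials):
        v = v_x + rng.uniform(-eps_v, eps_v)
        w = w_x + rng.uniform(-eps_w, eps_w)
        heads += lands_heads(v, w)
    return heads / trials

rng = random.Random(0)
v_x, w_x = 400.0, 248.5   # hypothetical values for a trial X that gave heads
assert lands_heads(v_x, w_x)

# Loose tolerances: the neighbourhood spans many head and tail bands.
wide = heads_frequency_near(v_x, w_x, 40.0, 25.0, 20000, rng)
# Tight tolerances: the neighbourhood stays inside one head-producing band.
narrow = heads_frequency_near(v_x, w_x, 0.5, 0.3, 20000, rng)
print(wide, narrow)
```

In this toy model the tight tolerances keep every trial inside a single head-producing band, so the frequency of heads is 1, while the loose tolerances let it fall back towards one half.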

Of course, we do not know the values of $I_1^X, \ldots, I_n^X$ in the case of the
trial X. (A trial in which we could measure the values of $I_1, \ldots, I_n$ would
probably not conform to our operating instructions, since the laboratory would
be full of hardware; nor would it have the appropriate permanent
properties, since a super tossing device would have to be designed and
substituted for our old tossing device.) Nor can we confirm the truth of $S_2$
directly by experiment, because we cannot produce a sequence of trials
of which $S_2$ is true. Rather, our (provisionally) accepted theories, together
with a premise reporting that the result of the toss X was heads, imply
that $S_2$ is true.

Some philosophers have held that the propensity theory, supposing it 
to be coherent, should be confined to examples of radioactive decay and 
the like, because coin tossing, dicing and the like are fully deterministic 
systems. Does Popper accept that coin tossing is a model of his propensity 
theory? Dicing is uncontroversially a model of the frequency theory and, 
in an early ([1959]) article on his propensity theory, Popper argues that 
the frequency theory, as applied to dicing, presupposes a propensity 
interpretation of the same examples. Hence Popper explicitly argues that 
dicing is a model of his propensity theory. But his argument could equally 
well have used the uncontroversial application of the frequency theory to 
coin tossing and would then have concluded that such an application 
presupposes a propensity interpretation of coin tossing. Furthermore, 
Popper’s writings on his propensity theory often use coin-tossing examples 
with no indication that they are merely illustrative and not to be taken 
literally. So I conclude that Popper does accept that coin tossing is a 
model of his propensity theory. 

Mellor says that coin-tossing examples are not ‘ideal’ for the propensity 
theory. He writes ([1971], p. 71): 

Propensities admitted in scientific theories must be connected to other properties
in whose terms they admit of explanation. It thus happens that indirect evidence
can occur for propensities as for other dispositions [viz. evidence which is
directly of these other connected properties of the set-up]. Thus it also happens
that traditional gambling examples are not ideal in that their interest lies more
in the trial than in the set-up. Serious sciences do not deal with the propensities 
of coins and dice, although something is known about them. 


What is lacking, Mellor believes, in coin-tossing examples, which keeps 
them from being ‘ideal’ examples of propensities, is a network of laws 
which ascribe the propensity to one or other of the coin or the tossing 
device. He writes (ibid., p. 75):


It is true that the chance distribution over the possible results of tossing a coin 
is affected by the properties both of the coin and of the tossing device. The 
propensity displayed may be ascribed to either. . . . The convention is unusually 
arbitrary in this case just because there is no serious science with a network 
of laws about coins or tossing devices into which the propensity can be fitted. 
The ascription of a propensity here either way may be taken to express a 
conviction that such a science is possible. 


Does Mellor think that enough is known for him to agree with $S_2$,
despite our lack of a ‘serious’ science of coin tossing? Mellor thinks that
$S_2$ is false, because he thinks that, although not an ideal example,
coin-tossing is an example of the display of a propensity, whereas to subscribe
to $S_2$, Mellor believes, is to regard coin tossing as a fully deterministic
system. Mellor writes (ibid., p. 153):


The causal antecedents of a coin landing heads . . . all follow the act of tossing it. 
Of that act none is the effect, although the act may well have other effects. I take 
this much to be entailed by a coin toss being a chance trial, and I take 
determinism to deny it. A determinist will suppose a more detailed description 
to distinguish the trials that result in heads from those that result in tails. The 
result heads, when it occurs, is taken to be a causal effect of the act of tossing 
under some more detailed true description (e.g. Sklar [1970], p. 360). 


Giving determinate values to $I_1, \ldots, I_n$ is giving a more detailed description
of the act of tossing a coin, so clearly Mellor would regard accepting $S_2$ as
accepting determinism to the exclusion of a propensity interpretation of
the chance distribution.¹

Sklar would accept $S_2$. Or, more accurately, he takes as an unstated
premise in his argument the equivalent proposition that the limiting
frequency of heads in the infinite sequence of tosses is determined by the
distribution of values of the ‘hidden’ initial conditions, viz. the limiting
frequencies of initial positions, velocities, angles, etc. For example, it is
an unstated premise in the following passage (Sklar [1970], p. 364): 

1 Mellor gives and rejects a version of $S_2$ on pp. 154-5 of Mellor [1971]. I understand
that in ‘The Case for Chances: reply to Professor Salmon’ Mellor had changed his view
from that quoted. His later view is that chances are compatible with determinism.
What would the LRF [limit of relative frequency] of heads be in an infinite
sequence of coin tossings? That depends upon what the distribution of initial
positions, velocities, angles, etc. would be. If we assume that they would be
distributed uniformly relative to our usual spatial measures, then the objective
probability will be well-defined. It will in fact be 1/2.


The argument which I wish to extract from Sklar (ibid., pp. 362-6)
proceeds by comparing the propensity theory with the frequency theory. 
The frequency theory presupposes a given sequence of tosses. In that 
given sequence there will be a determinate (but unknown) distribution 
of values of the parameters $I_1, \ldots, I_n$. The propensity theory, on the
other hand, applies to a single toss. The propensity is a dispositional 
property of this single toss (Popper), or of the coin or tossing device used 
in the single toss (Mellor). The ascription of a propensity to a single toss 
implies, roughly, that if the toss were repeated an infinite number of
times, then the limiting frequency of the outcome would have a certain 
value. The crucial difference between the frequency theory and the 
propensity theory, for Sklar’s argument, is that the propensity theory 
applies directly to a given single toss, which propensity is measured by a 
frequency in a possible sequence of tosses, whereas the frequency theory 
applies directly to an actual sequence of tosses. Sklar holds, as $S_2$ implies,
that the relative frequency of an outcome in a sequence is determined by
relative frequencies of the values of $I_1, \ldots, I_n$. In the actual sequence
considered by the frequency theorist, the parameters $I_1, \ldots, I_n$ will have
determinate (but unknown) relative frequencies. But, Sklar asks, what
relative frequencies will the values of $I_1, \ldots, I_n$ have in the merely possible
sequence considered by the propensity theorist? To decide this Sklar uses
another premise, which I also accept. He writes (ibid., p. 362):

I submit the following: our only constraint upon what would be true of such 
infinite sequences in the realm of pure possibility is what is true in the actual 
world as a matter of law. 

Sklar’s final premise is that the distribution of the initial conditions 
$I_1, \ldots, I_n$ in any finite actual sequence of tosses is completely
unconstrained by any lawlike features of the actual world whatever. Sklar
concludes that the attribution of a propensity to a single toss is not 
well-defined. (Sklar’s actual conclusion is that the concept of propensity 
must be relativised to distributions of initial conditions. But this implies 
the conclusion I have attributed to Sklar since no propensity theorist,
and certainly neither Popper nor Mellor, so relativises their concept of
probability.) Sklar’s conclusion is not merely that without such a specification
of distribution we cannot determine epistemologically what the
propensities are, nor merely that we must assume such a distribution if
propensity is to be a guide to rational action. His conclusion is stronger.
It is that without such a specification of distribution the concept of
propensity is not well defined (ibid., p. 363).

I accept that the argument is valid but reject the conclusion. So I must
reject one or more of the premisses. Mellor, as we have seen, rejects
$S_2$ and Sklar’s equivalent premise. I accept $S_2$ and reject instead Sklar’s
final premise. I claim that the parameters $I_1, \ldots, I_n$ in any actual sequence
of tosses are distributed in a lawlike manner. I shall defend my claim
shortly. If it is accepted then Sklar’s conclusion no longer follows. If the
distribution of values of $I_1, \ldots, I_n$ in actual sequences is found by experiment
to be similar, the more similar the longer the sequence, then, in
the normal way, we may conjecture that sequences of tosses, as a matter
of law, have such a distribution. And if the distribution of values of
$I_1, \ldots, I_n$ is lawlike and not accidental, then Sklar permits us to project
that distribution onto possible sequences of tosses. Hence a propensity,
directly applied to a single actual toss and measured by a frequency in 
a possible sequence, will be well-defined. 

To meet Sklar’s challenge I must show that, in actual finite sequences
of tosses, experimental results show that the distributions of values of the
parameters $I_1, \ldots, I_n$ are similar, the more similar the longer the sequences.
I have admitted that in no actual toss do we know what the value of any
one of $I_1, \ldots, I_n$ is. But Sklar adds a further difficulty for me: the
distribution of values of $I_1, \ldots, I_n$ will be relative to the measures used
for the parameters $I_1, \ldots, I_n$, and the choice of a measure is arbitrary.
For example, if the parameter $I_1$ could take values A or B or C then a
uniform distribution would assign a relative frequency of 1/3 to C. But if
we reclassify, changing the measure to say that $I_1$ can take either the value
(A or B) or the value C, then a uniform distribution would assign a relative
frequency of 1/2 to C. Sklar writes (ibid., p. 364):

But where does this measure come from? To select them [that is, the measures
for $I_1, \ldots, I_n$] so as to obtain some antecedently desired results for the LRF
[limiting relative frequency] of some particular outcome in the sequence would
be viciously circular. I submit that there is no lawlike feature of the actual world
which recommends one choice of measure over against any other possible choice.
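
The dependence on classification that Sklar presses here can be made concrete with a few lines of arithmetic. The sketch below is only an illustration: the labels and the helper name `uniform_frequency` are hypothetical, and "uniform" is taken to mean equal weight for each admissible value under the chosen classification.

```python
from fractions import Fraction

def uniform_frequency(cells, target):
    """Relative frequency that a uniform distribution over `cells` gives
    to the cell named `target`; each cell counts as one admissible value."""
    return Fraction(sum(1 for c in cells if c == target), len(cells))

fine_measure = ["A", "B", "C"]      # three values: A, B, C
coarse_measure = ["A or B", "C"]    # two values: (A or B), C

freq_fine = uniform_frequency(fine_measure, "C")
freq_coarse = uniform_frequency(coarse_measure, "C")
print(freq_fine, freq_coarse)  # prints 1/3 1/2
```

The same outcome C receives a different uniform frequency under each classification, which is why the choice of measure has to be justified rather than stipulated.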


Returning to our tossing device, coin and landing surface, suppose that
a finite but long actual sequence of tosses results in a relative frequency
of heads of approximately q/r, and the longer the sequence the closer
the relative frequency is to q/r. Suppose also that the relative frequency
in all other actual long sequences of tosses with relevantly similar tossing
devices, coins and landing surfaces is also close to q/r, the closer to q/r
the longer the sequence. These experimentally observed results would
lead us to conjecture that tossing devices, coins and surfaces with the
relevant permanent physical properties give a relative frequency of heads
in an infinite sequence equal to q/r as a matter of law.

Suppose that our provisionally accepted theories of mechanics, projectile
flight in a viscous medium and forces of non-elastic impact specify
$I_1, \ldots, I_n$ as the variable initial conditions and $P_1, \ldots, P_m$ as the relevant
permanent properties of the set-up. The laws of these theories are
expressed in standard c.g.s. measures, hence they will specify $I_1, \ldots, I_n$
and $P_1, \ldots, P_m$ in c.g.s. measures. We know that the total set-up is
capable of giving values to each of $I_1, \ldots, I_n$ only within a certain range.
Certain n-tuples of values of $I_1, \ldots, I_n$ within these ranges are physically
possible. (Not all n-tuples of values within these ranges need be physically
possible. For example, it may be that $I_1$ and $I_2$ can both take low values
or both take high values from within their respective ranges in the same
trial, but that it is not physically possible for one to take a low value and the
other to take a high value in the same trial.) Let the set of physically possible
n-tuples of values of $I_1, \ldots, I_n$ be K. We do not know which n-tuples
of values of $I_1, \ldots, I_n$ are members of K. But we do know that each member
of K is either head-producing (relative to the permanent properties
$P_1, \ldots, P_m$ of the set-up) or tail-producing (relative to the permanent
properties $P_1, \ldots, P_m$ of the set-up) and not both. Let the set of head-producing
members of K be $K_h$ and the set of tail-producing members
of K be $K_t$. The cardinality of $K_h$ and of $K_t$ is the cardinality of the
continuum. Each physically possible n-tuple of values of $I_1, \ldots, I_n$ is
either a member of $K_h$ or a member of $K_t$ and not both. The sets $K_h$ and
$K_t$ provide the needed measure. Let us say that each physically possible
n-tuple of values of $I_1, \ldots, I_n$ either has the value $K_h$ or the value $K_t$.

Are the c.g.s. measures of the values of each of $I_1, \ldots, I_n$ arbitrary?
The measures used are the same as the measures used in our laws of
mechanics, projectile flight through a viscous medium and forces of
partially elastic impact, because it is only relative to these laws (and to
the permanent properties $P_1, \ldots, P_m$ and to the operating instructions)
that the possible n-tuples of values of $I_1, \ldots, I_n$ are determined as head-producing
or as tail-producing. So the measures of $I_1, \ldots, I_n$ are no
more arbitrary than are the measures used in those theories. And the
measure $K_h$ or $K_t$ is determined in a non-arbitrary way by the measures
of $I_1, \ldots, I_n$ and the theories, permanent properties and operating
instructions.

Moreover, given our experimental results, we find that the proportion
of n-tuples which are $K_h$ to n-tuples which are $K_t$ in long finite sequences
of tosses is q/r, and closer to q/r the longer the sequence. We can conjecture
that this is a lawlike regularity for sequences whose set-ups have the
properties $P_1, \ldots, P_m$ and which are produced according to the operating
instructions. Nor is the attribution of the proportion of q initial conditions
of the value $K_h$ to r initial conditions of the value $K_t$ in an infinite sequence
viciously circular. That proportion is determined by experiment, by
taking the limit of the proportions shown in finite sequences of increasing
size.

Hence when we come to consider a merely possible finite sequence we 
may project that the proportion of initial conditions having value $K_h$ to
initial conditions having value $K_t$ will be close to q/r, the closer the
longer the possible sequence. And finally, in the case of a merely possible
sequence of infinite length we may project that the values of the n-tuples
of initial conditions are distributed in the measure $K_h$, $K_t$ in the proportion
q/r. Thus Sklar’s challenge is met.

I suspect that this, in principle, rather obvious solution was missed 
because it was masked by one of Sklar’s correct claims. I agree that the 
values of any one of $I_1, \ldots, I_n$ are not constrained by any lawlike feature
of the world whatever. Suppose that $I_1$ is the vertical component of the
force imparted to the coin by the tossing device. Suppose that the limit
of the relative frequency of heads for this set-up is 1/2. We may not infer,
from anything I have said, that the values of $I_1$, measured in c.g.s. units,
even have a limiting relative frequency. There is no justification for
supposing, as many philosophers have supposed, that the values of $I_1$ are
uniformly and randomly distributed within the appropriate range. Certainly,
as a human tossing device I have tried systematically to toss a coin as
high as I can. I might adopt a policy of, say, ten high tosses followed by
five low tosses, and I believe that the outcome would still show a lawlike
random distribution of heads and tails with a relative frequency of heads
of 1/2. Yet the values of $I_1$ would not be uniformly distributed: there would
be two peaks in the distribution, one near the top of the range and the
other near the bottom, about each of which the values of $I_1$ would cluster.
Nor would the variations in value of $I_1$ be random since there is a regular
cycle of 10 high values followed by 5 lower values. The same, I believe,
holds for any of the other parameters $I_2, \ldots, I_n$.
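
The policy of ten high tosses followed by five low tosses can be tried out in a simulation. The sketch below is a hedged illustration only: it assumes a hypothetical two-parameter coin (launch speed v and spin rate w, heads when an even number of half-turns is completed), with invented numerical values and a small uncontrolled scatter on each toss. It is not offered as the mechanics of any real tossing device.

```python
import math
import random

G = 981.0  # gravitational acceleration in cm/s^2 (c.g.s. units)

def lands_heads(v, w):
    """Toy rule (assumed for illustration): heads if the coin, starting
    heads up, completes an even number of half-turns before landing."""
    return math.floor(w * (2.0 * v / G) / math.pi) % 2 == 0

def cyclic_tosses(n, rng):
    """Ten high tosses then five low tosses, repeated; each toss has a
    small uncontrolled scatter around the intended launch speed."""
    speeds, outcomes = [], []
    for i in range(n):
        intended = 400.0 if i % 15 < 10 else 150.0   # cm/s, hypothetical
        v = rng.gauss(intended, 0.05 * intended)
        w = rng.gauss(250.0, 12.5)                    # rad/s, hypothetical
        speeds.append(v)
        outcomes.append(lands_heads(v, w))
    return speeds, outcomes

rng = random.Random(1)
speeds, outcomes = cyclic_tosses(30000, rng)

# Share of heads among all tosses.
heads_freq = sum(outcomes) / len(outcomes)
# Share of tosses in the upper cluster of launch speeds.
high_share = sum(1 for v in speeds if v > 275.0) / len(speeds)
print(heads_freq, high_share)
```

Under these assumptions the launch speeds fall into two clusters in a fixed cycle, about two thirds of them in the upper cluster, while the share of heads stays close to one half.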

In the case of the coin which lands heads with a limiting relative 
frequency of q/r, the distribution of values of each of $I_1, \ldots, I_n$, measured
in c.g.s. units, may vary in a way not subject to any law from sequence
to sequence. Yet the ratio of head-producing n-tuples to tail-producing
n-tuples is stable and close to q/r for each of the sequences. The evidence
for both these assertions is experimental observation. I suspect that this
is possible because, given any head-producing n-tuple, a minute change
in the value of any one of $I_1, \ldots, I_n$ will turn it into a tail-producing
n-tuple, and, given any tail-producing n-tuple, a minute change in any
one of $I_1, \ldots, I_n$ will change it into a head-producing n-tuple.

Another way of expressing the lawlike behaviour of the set-up is to 
say that when operated according to the operating instructions to produce 
long sequences of trials it ‘selects’ head-producing n-tuples of values of 
$I_1, \ldots, I_n$ approximately in the proportion q/r, the longer the sequence
the closer the proportion is to q/r. This propensity, or the propensity of
which this is the display, must be attributed to the whole of the set-up,
as Popper claims, and not to one or other part of it, as Mellor claims. For
the n-tuples of values of $I_1, \ldots, I_n$ are classified as head-producing or
as tail-producing relative to the permanent properties $P_1, \ldots, P_m$ which
will include properties of the tossing device, the coin and the landing
surface. Even if all the parameters $I_1, \ldots, I_n$ applied to, say, the coin
alone, their classification into head-producing and tail-producing would
still be relative to such factors as the elastic properties of the landing
surface and the relative positions of the coin when $I_1, \ldots, I_n$ apply and
of the landing surface. In practice, since these relative positions will not
be the same from trial to trial they will appear among the parameters
$I_1, \ldots, I_n$ and not among the permanent properties $P_1, \ldots, P_m$.

Finally in this section, some comments on a passage in which Sklar 
comes closest to considering the propensity theory I have been developing. 
He writes (ibid., p. 365):

We can, of course, now design a “dispositional” theory [i.e. a propensity theory]
that takes account of these actual frequencies, defining the probability of an
outcome in a given “next” experiment, relative to a kind of that experiment, as
what would be the LRF [limit of relative frequency] in an infinite sequence of
possible experiments of that kind in which the long-run distribution of the hidden
variables was just what the actual distribution of values of these variables is in
the finite set of experiments of that kind in the actual world.


Sklar comments on this suggestion (ibid.):


But this would be a long way from Popper’s theory. In addition, it is hard to 
see what the point of the excursus into possibilia is, since actual relative 
frequency of the outcome in the finite class of experiments of the given kind 
will serve our purposes equally well. Once we have resorted to finite classes, 
there seems little point in abandoning the actual for the possible. 


The crucial differences between the propensity theory I have been 
developing and the propensity theory Sklar outlines in this passage are 
firstly that what is projected in my theory from the actual finite sequence 
to an infinite possible sequence is not the actual distribution of the values 
of $I_1, \ldots, I_n$ measured in standard c.g.s. units (which, I think, like Sklar,
is unstable and therefore changing as ‘the finite set of experiments of
that kind in the actual world’ grows) but the distribution of head-producing 
and tail-producing n-tuples (which is found experimentally to be stable) 
and secondly that the projection from actual to possible sequences is 
sanctioned by the lawlike stability of the distribution. Sklar’s second
comment could be applied to the theory I have been defending and it 
raises an important question. But it is a different question: Sklar’s challenge 
was to show how a propensity could be well-defined; to enquire into the 
point of adopting a propensity theory is a different but related story, to 
which we now turn. 


2 The crucial point of introducing propensities into our ontology,
according to Tom Settle, is that the propensity postulated should be
‘at least part cause of the outcomes’ (Settle [1975], p. 411). If the propensity
does not play this causal and explanatory role, then the so-called
‘propensity theory’ is ‘little more than a frequency theory embellished
with an abbreviative concept’, i.e., ‘a theory in which the propensity
concept summarizes possible outcomes’ (ibid.).

Settle would regard my propensity theory as deterministic and consequently
as not being a full-blooded propensity theory. He writes (ibid.,
p. 410): ‘Let us call “deterministic” a law or theory which yields a unique
value for each variable in its formulation when the values of the other
variables are fixed.’ Clearly my $S_2$ is in the spirit of this general formulation
of determinism, applied to the particular case of coin-tossing, since, in
an ideal limiting case, the values of $I_1, \ldots, I_n$ are fixed as $I_1^X, \ldots, I_n^X$ and
the outcome is an unbroken sequence of heads. A connecting argument
to show that therefore my ‘propensities’ have no explanatory role to play,
that they are not part causes of the outcomes, is most clearly expressed
by Landé in his [1965], ch. II. Landé and Settle would argue that if the
laws of the theory are deterministic in the sense given, then the theory
can only yield a statistical distribution of outcomes if there is a corresponding
statistical distribution in the values of the initial conditions $I_1, \ldots, I_n$, a
point forcibly made by Popper ([1968], p. 208). But now the problem of
giving an explanation of the statistical distribution of the outcomes is
replaced by the analogous problem of explaining the statistical distribution
postulated in the values of $I_1, \ldots, I_n$. The statistical part of the cause of
the statistical distribution of the outcomes is traced back to whatever
caused the statistical distribution of the initial conditions $I_1, \ldots, I_n$. This
cause is beyond the scope of the deterministic theories invoked to explain
coin-tossing (the deterministic, ‘classical’, theories of mechanics, motion
through a viscous medium etc.), which theories simply relate the statistical
distributions of the values of the input variables to the statistical distributions
of the values of the outcomes. So deterministic theories always fail to
give an explanation of empirically observed statistical distributions, having 
always to postulate an earlier unexplained statistical distribution. Indeed, 
Landé argues in a spirit of reductio ad absurdum, a determinist is forced 
to postulate a malevolent demon who, in some remote epoch, so arranged 
conditions in diverse parts of the universe that, in the fullness of time, 
these diverse causal chains would converge to form that miraculous 
statistical distribution of values of J}, ..., J, which yields my empirically 
observed statistical distribution of heads and tails. 

However, unlike the determinist Laplace, ‘je n’ai besoin de cette
hypothèse’. My propensity theory makes no claim that the values of the
variables $I_1, \ldots, I_n$, as measured in c.g.s. units, have a statistical
variation during a sequence of trials. Indeed, I reported that in one
experimental sequence of trials, the value of $I_1$, the vertical component of
the force imparted to the coin by the tossing-device, clustered around two
values, one high and one low, and that these peaks in the distribution of the
values of $I_1$ recurred in a regular cycle of ten high tosses followed by
five low tosses, and yet the outcomes showed a 50:50 proportion of heads
to tails randomly distributed.

The sequence of results, heads to tails in the proportion q/r randomly
distributed, shows that the proportion of n-tuples of $I_1, \ldots, I_n$ having
the value $K_h$ to n-tuples of $I_1, \ldots, I_n$ having the value $K_t$ is q/r,
and that they are distributed randomly. The values $K_h$ and $K_t$ are defined
on the n-tuples $\langle I_1, \ldots, I_n \rangle$, where $I_1, \ldots, I_n$ are
measured in c.g.s. units, by the theories of mechanics, projectile flight
through a viscous medium, etc., given the permanent properties
$P_1, \ldots, P_m$ of the trial set-up. Let us suppose, what is probably not
the case, that the purely mathematical problems have been solved, that, given
any n-tuple $\langle I_1, \ldots, I_n \rangle$, where the values of
$I_1, \ldots, I_n$ are given in c.g.s. units, we can calculate, from the laws
of mechanics, projectile flight in a viscous medium, etc., and from the values
of $P_1, \ldots, P_m$ also given in c.g.s. units, whether that n-tuple is a
member of $K_h$ or a member of $K_t$. That is to say, suppose we have
non-trivial specifications of the sets $K_h$ and $K_t$.

I agree with Popper (ibid.) (subject to a qualification to be introduced
shortly) that a statistical conclusion can never be derived from
‘deterministic’ premises alone. We need, in addition, a premise assigning
statistical distributions to the initial conditions. To explain the empirical
law that the proportion of heads to tails in any long sequence of tosses
each satisfying $P_1, \ldots, P_m$ is close to q/r, the closer to q/r the longer
the sequence of trials, we need, in addition to our deterministic laws of
mechanics, projectile flight etc., a premise specifying that the proportion
of n-tuples of $I_1, \ldots, I_n$ having $K_h$ to n-tuples of $I_1, \ldots, I_n$
having $K_t$ is close to q/r, the closer the longer the sequence of trials,
where $K_h$ and $K_t$ are specified in a non-trivial manner.

However, the explanation ends there. No explanation in terms of prior
causes is called for of the empirical fact that the proportion of $K_h$ n-tuples
to $K_t$ n-tuples in some long sequence is close to q/r, where $K_h$ and $K_t$
are non-trivially specified. Any practically possible long sequence of n-tuples
of values of $I_1, \ldots, I_n$ measured in c.g.s. units would do: any sequence,
that is, which is consistent with our general beliefs about how the values
of $I_1, \ldots, I_n$ are caused and with our general beliefs about the
circumstances of coin-tossing, whether or not that sequence even has a
statistical distribution. (This is not a retreat into some subjectivist theory
of probability: merely an appeal to the uncontroversial empirical facts about
coin-tossing.) I conjecture that the non-trivial specifications of $K_h$ and
$K_t$ would be found to be such that any practically possible long sequence of
n-tuples specified in c.g.s. units would become, when reclassified as
members of $K_h$ or of $K_t$, a random-like sequence in which the proportion
of members of $K_h$ to members of $K_t$ was close to q/r, the closer to q/r the
longer the sequence. Landé’s demon is not needed because he has nothing
to do. The sequences of values of $I_1, \ldots, I_n$ measured in c.g.s. units
are unremarkable and show no particular pattern, as would be expected. Only
when they are classified by our non-trivial specification of $K_h$ and $K_t$
does the statistical pattern emerge. But that pattern is to be explained by the
permanent properties $P_1, \ldots, P_m$ and the laws of mechanics etc. which
went into the specification of $K_h$ and $K_t$, not by any prior causes. Indeed,
if we used our non-trivial specifications of $K_h$ and $K_t$ to devise a long
sequence in which the proportion of n-tuples belonging to $K_h$ to n-tuples
belonging to $K_t$ deviated significantly from q/r then we should find, I
conjecture, that the sequences of values taken by $I_1, \ldots, I_n$, measured
in c.g.s. units, showed some extraordinary pattern which, when matched
against the uncontroversial facts about coin-tossing, did require
explanation.
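
The conjecture can be pictured with a toy model. The Python sketch below is not a model of real coin-tossing: the single initial condition, the band width `BAND`, the two cluster values and the drift rule are all hypothetical, chosen only so that the arithmetic can be checked by hand. It shows inputs that cluster in a regular cycle of ten high tosses and five low ones and that, once reclassified by a fixed deterministic rule into $K_h$ and $K_t$, yield heads and tails in equal proportion. It makes no claim about the randomness of the order.

```python
# Toy illustration only: BAND, the two cluster values and the drift rule
# are assumptions made for this sketch, not measurements or physical laws.

BAND = 10  # hypothetical width (arbitrary units) of each outcome band


def classify(i_value: int) -> str:
    """Deterministic stand-in for the laws of mechanics: every value of the
    single initial condition belongs to K_h ("H") or to K_t ("T")."""
    return "H" if (i_value // BAND) % 2 == 0 else "T"


def initial_conditions(n_trials: int):
    """Inputs clustered round a high and a low value, in a regular cycle of
    ten high tosses followed by five low ones, each with a small drift."""
    for k in range(n_trials):
        base = 50_000 if k % 15 < 10 else 20_000
        drift = (919 * k) % 1000  # small trial-to-trial variation (assumed)
        yield base + drift


inputs = list(initial_conditions(3000))
outcomes = [classify(i) for i in inputs]
heads = outcomes.count("H")
tails = outcomes.count("T")
print(f"high-cluster tosses: {sum(i >= 50_000 for i in inputs)}")
print(f"heads: {heads}, tails: {tails}")
```

The balance of heads and tails here comes entirely from the way `classify` partitions the inputs, which is the role the text assigns to the non-trivial specifications of $K_h$ and $K_t$; the input values themselves remain clustered and cyclic.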

Popper’s dictum that a statistical conclusion requires for its derivation
a statistical premise needs qualifying. From a non-statistical premise
specifying a sequence of trials by a sequence of n-tuples of values for
$I_1, \ldots, I_n$ in c.g.s. units and a premise giving non-trivial
specifications of $K_h$ and $K_t$ we could draw statistical conclusions about
the proportion of n-tuples having the value $K_h$ to n-tuples having the value
$K_t$ and consequently about the proportion of heads to tails in the outcomes.

The sequences referred to in statistical conclusions are indefinitely long
sequences: e.g. the proportion of heads to tails in a long sequence of
tosses satisfying $P_1, \ldots, P_m$ is close to q/r, the closer to q/r the
longer the sequence. The non-statistical premise will therefore have to specify
an indefinitely long sequence in c.g.s. units of n-tuples of values of
$I_1, \ldots, I_n$. Clearly this cannot be done in extension but it can be done
in intension in the following manner. Let the single toss X be a toss which
results in heads and in which $I_1, \ldots, I_n$ take the values
$I_1^X, \ldots, I_n^X$, and the single toss Y be a toss which results in tails
and in which $I_1, \ldots, I_n$ take the values $I_1^Y, \ldots, I_n^Y$. Then let
the non-statistical premise specify a sequence which consists of q members
having the values $I_1^X, \ldots, I_n^X$ followed by r members having the
values $I_1^Y, \ldots, I_n^Y$, repeated indefinitely. From this non-statistical
premise, together with a premise giving non-trivial specifications of $K_h$
and $K_t$, we can draw the conclusion that the proportion of heads to tails in
the outcomes of a long segment is close to q/r, the closer the longer the
segment. Of course, this is not yet a statistical conclusion because the
results heads and tails are not distributed randomly in the proportions q/r.
But we could give a non-statistical mathematical generating condition using
the n-tuples $\langle I_1^X, \ldots, I_n^X \rangle$ and
$\langle I_1^Y, \ldots, I_n^Y \rangle$ which would generate an indefinitely
long sequence of n-tuples in the proportions q/r in random order. (Compare
Popper’s fully deterministic mathematical generating condition for a random
sequence of 1’s and 0’s in equal proportion ([1968], Appendix iv).) From such
non-statistical premises we could draw the statistical conclusion that the
proportion of heads to tails in a long segment is close to q/r, randomly
distributed, the closer to q/r the longer the segment.
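
Specifying an indefinitely long sequence by a rule rather than by enumeration can be sketched as a generator. The Python fragment below is a minimal illustration under stated assumptions: the labels "X" and "Y" stand in for the heads-yielding and tails-yielding n-tuples, and the function names and the values q = 3, r = 2 are hypothetical. It covers only the first, periodic construction, in which the proportion is right but the order is not random.

```python
from itertools import cycle, islice


def periodic_tosses(q: int, r: int):
    """Endless sequence: q copies of the heads-yielding tuple X, then r
    copies of the tails-yielding tuple Y, repeated indefinitely."""
    return cycle(["X"] * q + ["Y"] * r)


def proportions(q: int, r: int, length: int):
    """Count X and Y in the first `length` members of the sequence."""
    segment = list(islice(periodic_tosses(q, r), length))
    return segment.count("X"), segment.count("Y")


# Hypothetical values q = 3, r = 2, inspected over a segment of 5000 tosses.
x_count, y_count = proportions(q=3, r=2, length=5000)
print(x_count, y_count)
```

The rule is finite although the sequence it specifies is not, which is all that a specification in intension requires.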

However, such counterexamples to Popper’s dictum are trivial and can
be set aside. They can be set aside because one premise specifies an
indefinitely long sequence of n-tuples by a mathematical generating
condition. But our best conjecture about coin-tossing and other gambling
devices is that there are no suitable non-statistical generating forces
associated with set-ups satisfying $P_1, \ldots, P_m$. The physical counterpart
of the mathematical generating condition would have to be a mechanism
which determined the value in c.g.s. units of each n-tuple in an indefinitely
long sequence, determined that the proportion of those having the value $K_h$
to those having the value $K_t$ is q/r, and determined that they appeared in
random order. Contrary to all our thinking about gambling systems, the
mechanism would have causal links between a given trial and its predecessors
and successors. To postulate such a mechanism would be a conjecture on a par
with postulating Landé’s demon. Hence we can ignore such contrived
counterexamples to Popper’s dictum.

I conclude that our best conjecture is that the statistical distribution
of q n-tuples of value $K_h$ to r n-tuples of value $K_t$, where $K_h$ and $K_t$
are non-trivially specified, is at least part cause of the statistical
distribution of the outcomes, that this distribution of n-tuples is to be
explained as a dispositional consequence of the permanent properties
$P_1, \ldots, P_m$ (i.e. as a propensity) and not in terms of some prior
ordering of the initial conditions, and therefore that the propensity theory
I have advocated is a full-blooded propensity theory.


3 I have developed my theory in the case of coin-tossing, but of course
I believe it applies more widely. It applies wherever a statistical distribution 
is to be explained by deterministic functional laws and by initial conditions 
which are not themselves the outcome of some chance process. 

Landé’s blade-experiment (Landé [1965], p. 27 ff.) appears to present 
a case which refutes my theory. In this experiment a ball is dropped onto 
a blade and the outcomes are the ball falling to the right of the blade or 
the ball falling to the left of the blade. In practice, the more symmetrical
we make the conditions, the sharper the blade, the more spherical and
evenly dense the ball, the more accurately its centre of gravity at the 
moment of release is positioned over the blade, the closer the outcomes 
approximate to a random sequence of left and right in the proportion 
50:50. In any actual sequence of trials there will be minute and unordered 
variations in the properties of the set-up and in the ‘initial conditions’ 
and my theory is not threatened. The problem, for me, is in the ideal 
case in which the ball is always a perfect rigid sphere of even density, the 
blade a geometrical line and the ball is released at a point vertically above 
the line. This case presents a problem because the initial conditions and 
permanent properties are invariant from trial to trial, yet, says Landé, 
we would still get a 50:50 distribution of right and left balls. But of 
course we need suppose a 50:50 distribution of outcomes in the ideal 
case only if we suppose, as Landé does, that classical deterministic 
mechanics is false of events on the macroscopic scale. If we suppose that 
classical deterministic mechanics is true of macroscopic events, then the 
outcome in the ideal case is that the ball will either rest balanced on the 
blade or bounce on the blade for ever. I do not find this result of Landé’s 
thought experiment repugnant to reason. For consider a perfect sphere at
rest on a horizontal plane: the sphere is in contact with the plane at
only one point, yet we easily suppose it remains at rest. Or consider a
perfectly rigid sphere bouncing on a rigid horizontal plane: we do not
need to suppose that the point of impact moves around the plane. So I
find nothing in the idealised version of Landé’s blade experiment which
requires either the rejection of classical ‘deterministic’ mechanics for
macroscopic events or the rejection of my propensity theory for
macroscopic statistical processes.


University of Glasgow 
Between Chomskian Rationalism
and Popperian Empiricism 


by STEPHEN P. STICH 


1 Chomsky’s Anti-Empiricism and Chomsky’s Rationalism
(a) Some Assumptions about Language Learning
(b) Chomsky’s Anti-Empiricism
(c) Chomsky’s Rigid Rationalism

2 Sampson’s Argument

3 Between Rigid Rationalism and Popperian Empiricism
(a) The Sampson—Popper Language Acquisition Device 
(b) Four Ways to Change the Sampson—Popper Model 


Noam Chomsky’s rationalist account of the human mind has won many 
adherents and attracted many critics. What has been little noticed on 
either side of the debate is that Chomsky’s rationalism is best viewed as a 
pair of quite distinct doctrines about the mental mechanisms responsible 
for language acquisition. One of these doctrines, the one I will call rigid
rationalism, entails the other, which I call anti-empiricism, but the
entailment is not mutual. Rigid rationalism is much the stronger of the two.
What is more, the argument Chomsky offers for rigid rationalism is quite 
distinct from the argument for anti-empiricism. In the first section of this 
paper I will set out what I take to be the most favourable interpretation 
of each of these doctrines, along with the argument supporting it. 

Until recently, one of Chomsky’s most energetic supporters has been
Geoffrey Sampson.¹ However, Sampson has now jumped ship and joined
the ranks of the opposition. To make matters worse, Sampson did not
leave the Chomskian camp empty handed. He absconded with the data.
In an intriguing argument Sampson urges that the very data which
Chomsky uses to support his rationalist theory of mind are in fact better
evidence for a radically different theory of mind, an empiricist theory
suggested by the work of Karl Popper and his followers.² In the second

Received 23 February 1979
1 See, for example, Sampson [1975].
2 The argument is set out in Sampson [1978]. For a more detailed version of the argument,
see Sampson [1980]. References in the text are to Sampson [1978].

section of this paper I will offer a critique of Sampson’s argument. The
burden of my critique will be that Sampson has failed to note the distinction 
between rigid rationalism and anti-empiricism, and that his argument at 
best undermines the stronger doctrine while leaving the weaker untouched. 
In the last few pages of this section I will turn my sights on bigger game. 
Once we have seen why Sampson’s argument does not work against 
anti-empiricism, we can also see that the Popperian empiricist account 
of the mind is simply inadequate to the task of explaining how humans 
can learn language. 

The arguments of the second section leave us with the conclusion that 
the correct theory of language acquisition must lie somewhere between 
Chomskian rationalism and Popperian empiricism. In the third section 
I will explore some of this intermediate territory. It is not my purpose to 
elaborate a detailed map. Rather I want to draw attention to some of the 
unexplored possibilities for theories which are neither empiricist nor rigid 
rationalist. 


1 CHOMSKY’S ANTI-EMPIRICISM AND CHOMSKY’S RATIONALISM


(a) Some Assumptions about Language Learning 

At the core of the dispute between Chomsky and his empiricist critics is 
the question of what the mind must be like in order to account for our 
ability to learn language as we do. Chomsky develops his arguments for 
a rationalist theory against a background of assumptions about what goes 
on when a language is learned. What happens, according to Chomsky, is 
that the learner comes to have a tacit knowledge or an internal representation 
of the rules of a grammar. The grammar which the speaker tacitly knows 
is simply the grammar of the language he speaks, the very same grammar 
that a linguist studying the speaker’s language seeks to uncover. This view 
of language learning is hardly uncontroversial. The critics, and I have 
been among them, have a pair of complaints. First, it has been argued that 
the relation between a speaker and the rules of his grammar is not plausibly 
viewed as a species of knowledge or belief.¹ Second, it has been claimed
that even if a language learner does acquire tacit knowledge of a set of
rules which he uses in producing and understanding his newly acquired
language, it is gratuitous to suppose that these rules will be the rules the
linguist uses to describe the language.² I am inclined to agree with both
of these objections. But having said this, I propose to simply ignore them

1 See, for example, Stich [1971], [1973], [1975]; Cooper [1975]; Nagel [1969]; Schwartz
[1969].
2 Cf. Stich [1972], [1973], [1975]. For an example of a theory of language comprehension
which does not postulate an internally represented grammar, cf. Winograd [1972].

and adopt the view, shared by Chomsky and Sampson, that the end
product of language learning is a tacit knowledge of the rules of the 
grammar. It is a concession I make not merely for argument’s sake. For 
it now seems to me that a plausible account can be given of a notion of 
internal representation which does not make all internal representation 
a species of belief or knowledge.¹ If this can be done, the first objection
has been blunted. The second assumption, that the rules a speaker 
internalizes are the rules of a linguist’s grammar, is largely irrelevant to 
Chomsky’s anti-empiricism. It plays a more substantive role in Chomsky’s 
argument for rigid rationalism. But as we shall see, there is abundant 
reason to be sceptical about rigid rationalism even granting the assumption. 


(b) Chomsky’s Anti-Empiricism 

With Chomsky, let us grant that after learning a language a speaker has 
tacit knowledge of the rules that a linguist would use to describe his 
language. What can we say about the mental mechanism that mediates the 
acquisition process? This mechanism can be viewed as an input-output 
device where the output is simply the grammar that the speaker tacitly 
knows. The input is rather harder to specify with precision. It consists 
of all those features of the learner’s experience which are relevant to the 
language learning process. A priori, there is little more that can be said 
about the input to the language acquisition mechanism. Following 
Chomsky, let us give the input the label primary linguistic data. 

Now one of the claims Chomsky has made about the language acquisition 
device, and to my mind the most plausible, is that the device must invoke 
mechanisms that are incompatible with the empiricist view of the mind. 
The claim is a vague one, of course, since empiricists are hardly of one 
mind about the mind. Still, Chomsky has argued persuasively that there 
are significant family resemblances, as well as significant historical links, 
among the views of the mind advanced by empiricist philosophers and 
modern learning theorists in the behaviorist tradition.² While the
differences among these various theorists are considerable, there is one argument
suggested by Chomsky which is telling against any view of language
learning that would plausibly count as empiricist. In an earlier paper,
I dubbed this argument the rational scientist argument.³

The argument can be put as follows. Suppose we set a rational scientist
the task of duplicating the achievement of a language acquisition device:
we provide our rational scientist with a set of primary linguistic data, and

1 Cf. Stich [1978b].
2 Chomsky has developed this theme in many places. See, for example, Chomsky [1965],
[1966], [1968], [1975].
3 Stich [1978].

she must produce the grammar that a child would come to tacitly know
had the child been exposed to these data. Could the rational scientist 
succeed? The answer, Chomsky argues, is no. For consider the class of 
sentences that occur in the primary linguistic data provided. These will 
include (at most) all the sentences the child has ever heard along with all 
the sentences the child has ever himself produced.¹ Suppose, for concreteness,
that we focus on a particular child in an English speaking community,
so that all the sentences in his primary linguistic data are in English. Let 
us call this particular class of sentences EPLD (for ‘English primary 
linguistic data’). Now at the end of the acquisition process the child will 
tacitly know the grammar of his dialect of English, call it GR-ENG. This 
grammar will generate each of the sentences in EPLD and assign to 
each one or more phonetic, syntactic and semantic descriptions. GR-ENG 
will also generate indefinitely many sentences which are not in EPLD, and 
assign each of them one or more phonetic, syntactic and semantic descriptions.
The child will be able to produce many of these sentences and to
understand them in accord with the semantic description provided by 
GR-ENG. However, there are indefinitely many other grammars which 
coincide with GR-ENG about the sentences in EPLD but diverge from 
GR-ENG for sentences outside EPLD. Some of these alternative grammars 
will not generate the same class of sentences generated by GR-ENG. 
Others, while generating the same class of sentences, will assign sentences 
outside EPLD different phonetic, syntactic or semantic descriptions. 
Chomsky provides a number of hints about what such alternative grammars 
might look like,² but the argument does not turn on the production of
examples. It is a commonplace that a finite body of data can be accom- 
modated by indefinitely many theories, and the present point is just a 
special case of this commonplace. Indeed, it is a particularly obvious 
special case since the core of a grammar is a generative system, and it is 
trivial to show that there are indefinitely many generative systems that 
will produce a given finite class of tree structures while diverging outside 
that class. Given the existence of indefinitely many grammars, all agreeing 
in what they say about EPLD, how is our rational scientist to choose? 
Some of the alternative grammars will perhaps be excludable on grounds 
of simplicity, there being simpler grammars that handle the same data. 
But it is hardly likely that an appeal to simplicity will yield a single 

1 There is a bit of idealisation here, since no doubt the primary linguistic data available
to the child will contain a sampling of non-sentences of various sorts. But the idealisation
favours the empiricist, not his Chomskian opponent. If the empiricist has trouble
accounting for language acquisition assuming clean data, he will have still more trouble
when the data are dirty.
2 Cf. for example Chomsky [1975], pp. 32-33.

candidate. And, indeed, it is not clear that an appeal to simplicity, as
judged by the rational scientist, will be of any help at all. To assume 
that it will is to assume that the child’s language acquisition mechanism 
prefers a grammar which is simpler, by the scientist’s lights, to one which 
is more complex, when both grammars agree in what they say about the 
data at hand. But that is an assumption for which there is neither evidence 
nor a priori argument. We conclude, then, that the rational scientist has no 
strategy available for selecting the grammar that the child actually acquires 
from the indefinitely large class of grammars which coincide with GR-ENG 
on EPLD, but diverge elsewhere. A bit fancifully, we can picture the choice 
confronting the rational scientist as in Figure 1. If we think of each point 
as a possible sentence, we can represent the class of sentences generated 
by a grammar as an oval. The scientist’s chore is to project from the 
sentences in EPLD in the same way that the child’s language learning
mechanism does, that is, to project from EPLD to GR-ENG. Without
further information, the chore would seem to be impossible to accomplish, 
save by accident. 
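
The underdetermination at the heart of the rational scientist argument can be made concrete with a toy case. In the Python sketch below everything except the labels EPLD and GR-ENG is hypothetical: the three-sentence sample, the toy language of strings of a's followed by equally many b's, and the family of rival grammars are assumptions made for illustration, and a grammar is crudely modelled as an acceptance test rather than as a generative system assigning descriptions.

```python
import re

# Hypothetical stand-in for EPLD: a finite sample from a toy language.
EPLD = ["ab", "aabb", "aaabbb"]


def gr_target(s: str) -> bool:
    """Toy stand-in for GR-ENG: accepts n a's followed by n b's, n >= 1."""
    m = re.fullmatch(r"(a+)(b+)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2))


def rival(k: int):
    """A rival grammar: agrees with gr_target except that it also accepts
    the string of k a's. Each choice of k gives a different rival."""
    def accepts(s: str) -> bool:
        return gr_target(s) or s == "a" * k
    return accepts


rivals = {k: rival(k) for k in range(1, 6)}  # five of indefinitely many

for k, grammar in rivals.items():
    agrees_on_data = all(grammar(s) == gr_target(s) for s in EPLD)
    diverges_outside = grammar("a" * k) != gr_target("a" * k)
    print(k, agrees_on_data, diverges_outside)
```

Nothing in the sample distinguishes any rival from the target, so a learner restricted to compatibility with the data has no ground for preferring one to another.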

To turn the rational scientist argument into an argument against 
empiricist theories of the mind, we need only note that the rational 
scientist has available to her all the theory constructing and inferential 
apparatus that would be attributed to the mind by any theory plausibly 
counted as empiricist. So if she cannot do what the child plainly can do, 
we can conclude that the child’s mind contains some learning mechanisms
that are not dreamt of in the empiricist’s philosophy. Note that the 
rational scientist argument by itself points to no positive account of what 
the language acquisition mechanism is like. Its conclusion is a negative 
one: whatever the language acquisition mechanism is like, it cannot be 
restricted to the mechanisms that empiricists attribute to the mind. It is 
this doctrine that I call Chomsky’s anti-empiricism. It is a doctrine that 
I endorse. 


(c) Chomsky’s Rigid Rationalism 

In tandem with his anti-empiricism, Chomsky has long urged a positive 
view about the nature of the human language acquisition mechanism. The 
problem for an acquisition mechanism, recall, is to discover the right 
grammar, the one the other members of the speaker’s community tacitly 
know, from the evidence of primary linguistic data. Mere compatibility 
with the data is not enough to insure the right choice, since there are 
indefinitely many logically possible grammars compatible with the data. 
But suppose the acquisition mechanism need not consider all logically 
possible grammars. Suppose instead that it is innately provided with a 
set of specifications which all human grammars must possess. Such a set 
of innate specifications would make the acquisition device’s chore consider- 
ably easier, since it would narrow down the class of candidates it need 
consider. Indeed, if the specifications were restrictive enough, it might 
turn out that only a single allowable grammar would be compatible with 
each naturally occurring sample of primary linguistic data. If the specifica- 
tions are not quite so restrictive, the acquisition device could still get by 
if it were provided with an innate ranking of grammars. It could then 
proceed down the ranked list of allowable grammars until it found one 
which was compatible with the primary linguistic data. This is the way 
Chomsky claims the human language acquisition device actually works. 
What Chomsky is urging, then, is that the language acquisition device
is innately constructed so that it can output only a restricted subset of
the logically possible grammars. There are certain quite specific and
non-trivial features characterising all the grammars that can be output by
the acquisition mechanism.¹ These features will thus be universal features
exhibited by all actual grammars. Moreover, they would also be exhibited 
by any language that a human could learn in the normal way. As Chomsky 
notes, there is a strong rationalist flavour to this picture of language 
acquisition. The mind, or at least that part of it concerned with language 


1 There is no easy way to specify which universals are to count as ‘trivial’. What I am 
most concerned to exclude as trivial are highly disjunctive ‘universals’. 


acquisition, is innately predisposed to acquire knowledge of a very special 
sort. Indeed, the innate predisposition is so strong that the mind cannot 
normally learn languages lacking the universal features. I will call this 
doctrine Chomsky’s rigid rationalism. 
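
The ranked-list procedure described above can be sketched as a simple search. The Python fragment below is an illustration under stated assumptions, not an account of any actual acquisition model: the function `acquire`, the three toy grammars and their ranking are hypothetical, and a grammar is again modelled merely as an acceptance test.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Grammar = Callable[[str], bool]  # a grammar modelled as an acceptance test


def acquire(ranked: List[Tuple[str, Grammar]],
            data: Iterable[str]) -> Optional[str]:
    """Walk down the innately ranked list and return the name of the first
    allowable grammar compatible with every datum, or None if none is."""
    sample = list(data)
    for name, accepts in ranked:
        if all(accepts(s) for s in sample):
            return name
    return None


# Hypothetical innate ranking of three allowable toy grammars.
RANKED: List[Tuple[str, Grammar]] = [
    ("only-ab", lambda s: s == "ab"),
    ("equal-a-b", lambda s: len(s) > 0
        and s == "a" * (len(s) // 2) + "b" * (len(s) // 2)),
    ("any-ab-string", lambda s: set(s) <= {"a", "b"}),
]

print(acquire(RANKED, ["ab"]))
print(acquire(RANKED, ["ab", "aabb"]))
print(acquire(RANKED, ["ab", "ba"]))
print(acquire(RANKED, ["abc"]))
```

On this picture a sample containing a string outside every allowable grammar yields no output at all, which corresponds to the claim that the mind cannot normally learn languages lacking the universal features.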

What are the arguments Chomsky offers for his rigid rationalism? I 
think there are two. First, rigid rationalism is an alternative to empiricism, 
and as we saw Chomsky has adduced a powerful argument to show that the 
correct theory of language acquisition must be a non-empiricist one. Of 
course, the mere fact that rigid rationalism is a workable alternative to 
empiricist theories is a relatively weak reason for adopting it. It would be 
a much better reason if one thought that rigid rationalism were the only
available alternative to discredited empiricist theories. As I have shown
elsewhere,¹ Chomsky sometimes argues as though rigid rationalism were the
only available alternative to empiricism. However, this is simply false; 
we will see in section 3 that there are many sorts of non-empiricist theories 
which do not follow the rigid rationalist in insisting that the grammars 
humans can acquire must exhibit common ‘universal’ features. 

The second, and much the more important, argument for rigid 
rationalism is an empirical one. Chomsky claims that there are in fact 
universals to be discovered in the grammars of known human languages. 
There are non-trivial common features shared by the grammars of all 
languages that have been sufficiently studied. Granted that there are such 
universals, how can their existence be explained? It is Chomsky’s con- 
tention that rigid rationalism provides the best explanation. If the 
acquisition mechanism is innately programmed so that it can output only 
grammars exhibiting certain features, then we would expect that these 
features would be present in all human grammars. Much of Chomsky’s 
work in linguistics has been aimed at formulating and testing hypotheses 
about linguistic universals.²

This second argument, let us call it the best-explanation-of-universals 
argument, is susceptible to attack from two directions. First, the empirical 
premise that certain features are in fact universal might be challenged.
The forthright approach here would be to produce counter-examples to 
the putative universals, examples of languages which simply do not exhibit 
one or another of the features Chomsky claims to be universal. For those 
who are not accustomed to the honest toil of seeking out counter-examples, 
there is a less arduous way of questioning Chomsky’s empirical premise. 
A casual perusal of the literature of modern generative linguistics will 
quickly reveal that there is widespread and fundamental disagreement 


1 Cf. Stich [1978a], p. 281 and n. 10.
2 Cf. for example Chomsky [1968] ch. 2, Chomsky [1972] ch. 1, Chomsky [1975] ch. 3.

about the grammars of even the most carefully studied languages. In the
face of this disagreement I am inclined to think that much speculation
about linguistic universals is simply premature, particularly when the 
putative universals are very abstract features of the organisation of a 
syntactic theory. We are hardly in a position to conclude that all grammars 
exhibit a wide range of abstract universal features when we have not even 
a single grammar whose basic features are generally agreed upon by 
experts in the area. Moreover, as I have argued elsewhere, claims about 
abstract universal features in syntax should be particularly suspect, since 
the grammarian’s methodology has great potential for generating
pseudo-universals.¹

A different strategy for attacking the best-explanation-of-universals 
argument is to urge a better explanation. The idea here is to grant that 
one or another feature does in fact appear to be universal among known 
languages, but to argue that there is a better explanation of this fact than 
the hypothesis of rigid rationalism. Both lines of attack have been used
before in the literature.2 However, Sampson combines them in a new
and interesting way. He couples a forthright attack on some putative 
universals with a novel alternate explanation of the universals which resist 
forthright attack. It is to his argument that I now turn. 


2 SAMPSON’S ARGUMENT 


Sampson begins his critique by surveying a number of putative universals 
of phonetics, syntax and semantics, and marshalling evidence that in each 
case either there are known exceptions to the alleged universal features, or 
there is some relatively trivial and unproblematic explanation of why the 
feature is universal, an explanation which does not invoke rigid rationalism. 
Though I fancy myself an informed amateur on matters linguistic, years of 
close contact with professional linguists have convinced me that philo- 
sophers do well to stay out of linguists’ internecine quarrels. Thus I 
propose to simply assume, for argument’s sake, that Sampson is right 
about the universals he considers. My thesis will be that even granting 
all his linguistic contentions, Sampson has neither damaged Chomsky’s 
anti-empiricism nor has he made Popperian empiricism even remotely 
plausible. But here I am getting ahead of myself. 

Sampson does not claim that all alleged linguistic universals are spurious 
or trivially explainable. On the contrary he allows that in syntax ‘Chomsky 
is correct in claiming that there are characteristics which appear to hold 
1 Stich [1972]. 


2 For the first strategy, cf. Cooper [1975] and many references given in Sampson [1978].
For the second strategy, cf. Putnam [1967]. 
of all known human languages’ (p. 188). The basic property which Sampson
holds to be universal 


may be summed up in the statement that natural languages are hierarchically 
structured. The class of grammatical sentences of any natural language can 
be defined by means of a finite set of context-free phrase structure rules, which 
ascribe grammaticality to formative strings indirectly by ascribing well-
formedness to hierarchical ‘phrase-markers’ dominating these strings, possibly
supplemented by structure dependent ‘transformational rules’, which modify
strings in ways that depend exclusively on the nature of the hierarchical phrase- 
markers dominating those strings (p. 188). 


This universal feature of human syntax would appear to be ideal grist 
for the best-explanation-of-universals mill. It is a non-trivial universal 
for which there is no obvious explanation apart from rigid rationalism. 
It is just here that Sampson makes his bold move to steal Chomsky’s 
data. There is, Sampson claims, a better explanation of hierarchical 
structure than rigid rationalism. The explanation proposed relies on an 
argument due to Herbert Simon which shows that if a process is ‘formally
akin to the evolutionary process described by Darwinian theory’, it is 
overwhelmingly likely that the systems produced by the process will have 
a hierarchical structure (p. 191).1 Hence, Sampson claims, 
those complex phenomena in the world which are created by processes formally 
akin to Darwinian evolution will as a matter of contingent fact be hierarchically 
structured, even if as a matter of logic they need not have been, and even if
hierarchical structure makes them no fitter for survival once they have emerged
than would be potentially non-hierarchical alternatives (p. 191). 


To complete the argument, Sampson need only add the ‘a priori plausible’ 
premise that ‘human languages have arisen as the result of a gradual 
evolutionary process from simple beginnings’ (p. 192). If so, then the 
hierarchical structure that was to serve as data for the best-explanation-of- 
universals argument has found an alternative explanation which is at least 
as plausible as rigid rationalism. 

Now it might appear that if we grant a Simon-style explanation of the 
hierarchical structure of grammars is a possible one, then we are left 
with a stand-off. Both Sampson and the rigid rationalist have possible 
explanations, and there is as yet no way of deciding between them. But, 
Sampson argues, this is not the case. For there is an important fact which 
is accounted for only by a Simon-style explanation. This is the fact that 
the only universals that have been discovered are those which would be 
expected in light of Simon’s argument. The rigid rationalist expects that 
some features will be universal, but he is in no position to predict what 


1 Cf. Simon [1962] for details of Simon’s argument. 
these features will be. For the rigid rationalist, whether or not a given
feature happens to be one imposed by our innate mental structure is 
largely a matter of accident. For Sampson, however, there is a principled 
reason to expect that certain features and only those will be universal, 
and the data confirm this expectation (cf. p. 195). 

We would do well to remind ourselves that there are a pair of contro- 
versial assumptions embedded in Sampson’s argument. The first is that 
a Simon-style explanation can in fact be extended to the case of linguistic 
structure. The second is that, as a matter of empirical fact, the only 
features of human language which are universal are either hierarchical 
features of the sort a Simon-style explanation would have us expect, or 
are readily explainable without appeal to rigid rationalism. If these two 
assumptions can be made to stick, then Sampson will have further 
weakened the case for rigid rationalism. For the best-explanation-of- 
universals argument is the principal support for rigid rationalism. And if 
we grant Sampson’s assumptions, there is a better explanation of universals. 
It’s simple Simon. 

Let us turn now to the other half of Chomsky’s doctrine, his anti- 
empiricism, and see what implications Sampson’s argument has in that 
quarter. Sampson boldly embraces a theory of language learning modelled 
on Popperian empiricism. It is his contention that Simon-style considera-
tions can be used to defend a Popperian theory against Chomsky’s
anti-empiricist attack. But in this view, I will argue, Sampson is quite
mistaken. 

The Popperian view of learning that Sampson defends runs as follows. 
Learning is a gradual process in which hypotheses are formulated by the 
mind (possibly randomly) and then tested against experience. If a 
hypothesis is compatible with experience, it is retained for further testing; 
if it is falsified by experience, it is rejected, and the process begins again 
with a new hypothesis. The principal constraint on the new hypothesis is 
that it must be compatible with all previously acquired data.1 As Sampson
notes, this process of randomly formulating a hypothesis and then letting 
experience do its best to falsify it is parallel in form to Darwinian natural 
selection. This similarity provides the opening for Sampson to use his 
Simon gambit. Since a Popperian language learning mechanism is formally 
analogous to natural selection, we would expect the output of the 
mechanism to be hierarchically structured. So we need not postulate 


1 A more detailed sketch of the Popperian model for language learning is given in section 3,
below. I should stress that in calling the model ‘Popperian’ I do not mean to suggest 
that Popper would endorse it. For more on this terminological point, see the last 
paragraph of this section. 
anything more than a Popperian mind to explain the fact that a language
learner ends up learning a hierarchically structured grammar. There is 
no need to adopt an anti-empiricist theory of learning to account for the 
fact that the grammars children acquire are hierarchical. 

Now it is my contention that, even if we grant Sampson’s claim about 
the sorts of grammars a Popperian mind would learn, he still has not met 
the challenge of Chomsky’s rational scientist argument. The fault I find 
with Sampson’s argument is simply that he explains the wrong fact. As I 
have set out the rational scientist argument, the crucial fact leading to 
Chomsky’s rejection of empiricism is not that children learn a hierarchical 
grammar, nor even that they learn languages exhibiting the range of 
features Chomsky claims to be universal. Indeed, the rational scientist 
argument does not presuppose the existence of universals of any sort, 
and it works quite as well if we suppose that there are no linguistic 
universals. What is crucial for the rational scientist argument is that 
children learn the right grammar. That is, they learn the grammar which is 
tacitly known by the senior members of their linguistic community. Of 
course, given our Sampsonian assumptions, the right grammar will be a 
hierarchical grammar. But just as there will be indefinitely many logically 
possible grammars compatible with the child’s primary linguistic data, 
so too there will be indefinitely many hierarchical grammars compatible
with the child’s primary linguistic data. What needs explaining is how the 
child’s acquisition mechanism comes up with the right grammar from 
this indefinitely large set. 

It seems clear that no theory of learning along Popperian lines can 
possibly provide a satisfactory explanation of the fact that the child learns 
the right grammar. For at each stage in the Popperian process of con- 
jecture and refutation there are indefinitely many wrong conjectures 
compatible with all the data previously encountered. If the learning 
mechanism simply selects randomly among these, as the Popperian
analogy to random mutation in natural selection would suggest, then there 
is no reason to expect that children would learn the right language at all, 
let alone that they would do it relatively quickly and at much the same 
pace. If the child’s acquisition mechanism is Popperian, then we would 
expect that a few lucky children would hit on the right hypothesis early 
on, while the majority of children stumbled from one wrong hypothesis 
to another. 

My critique of Sampson can be made graphically with the aid of figure 1.
Suppose that GR-ENG along with GR-A to GR-C are a sampling of the 
indefinitely many hierarchical grammars compatible with EPLD, while 
GR-D to GR-F are a sampling of the indefinitely many non-hierarchical 
grammars compatible with EPLD. At best Sampson’s Simon gambit has
shown that a Popperian mind would be expected to acquire a grammar 
of the former category. But what has not been explained, and what I 
maintain simply cannot be explained, is how a Popperian mind could 
reliably learn the right grammar, the grammar which, for example, would 
generate S, rather than S,, Sz, So etc.

At this juncture a Popperian might well question the assumption that 
children do learn the right grammar. How do we know that they tacitly 
come to know the same grammar tacitly known by their senior co- 
linguists? The answer is that the child’s understanding of and judgments 
about sentences outside his primary linguistic data match up, near enough, 
with the understanding and judgments of his elders. This coincidence 
of understanding and judgment does not, of course, entail that parent 
and child tacitly know exactly the same rules. Rather, what it entails is 
that the rules they tacitly know are equivalent in the weak sense that each 
set of rules generates (more or less) the same set of sentences and assigns 
them (more or less) the same linguistic descriptions. But even this 
relatively weak equivalence between the child’s tacit knowledge and the 
tacit knowledge of his elders is a fact that a Popperian theory of learning 
cannot explain. For, to recur to a now familiar theme, there are indefinitely 
many grammars (indeed, indefinitely many hierarchical grammars) com- 
patible with the child’s primary linguistic data but not equivalent, even 
in the weak sense, to the grammars of his elders. The Popperian acquisition 
device has no way of selecting among them. 

Throughout this section I have been referring to the ‘Popperian’ theory 
of mind and the ‘Popperian’ language acquisition device. But, as I noted 
earlier in a footnote, the label may well be misleading. The theory of 
language acquisition I have been criticising is Sampson’s not Popper’s, 
and to the best of my knowledge Popper has never addressed himself 
directly to the issue of language acquisition. Still, Sampson’s theory is 
clearly modelled on Popper’s theory of science as a game of conjectures 
and refutations. In light of the strong parallels between the two theories, 
we might suspect that analogues of the problems which scuttle Sampson’s 
theory will plague Popper’s account of science as well. Oddly, however, 
this is not the case, and it will be instructive to see why. On the Popperian 
account, science is the game of proposing theories, then attempting to 
falsify them by checking their predictions against what can be observed 
in nature. At any given time, the theory we should use as our working 
hypothesis is the strongest theory yet proposed that has not yet been 
falsified by nature. (This, of course, is little more than a caricature of 
Popper’s view, but it will do for the purposes at hand.) Now suppose we 
try to pursue the strategy of the rational scientist argument against Popper’s
view of science. To begin we note that at any given time there will be 
indefinitely many logically possible theories each compatible with (i.e. not 
falsified by) all the data that have been collected in a given domain. To 
narrow the choice, the Popperian proposes that the scientist should be 
guided by the strength of the hypothesis and by other methodological 
considerations. But now we can ask the same question we asked about the 
rational scientist trying to discover the right grammar: Why should we 
believe that these methodological directives will in fact lead us to the 
right theory, among the indefinitely many alternatives that are compatible 
with our data? It is just here that the analogy between language learning 
and the Popperian account of science breaks down. For, of course, the 
Popperian answer is that we don’t (or at least shouldn’t) believe that 
the theory favoured by our methodological rules is the right one, the one 
which truly describes nature. Quite to the contrary, accepted scientific
theories are no more than working hypotheses which we expect will 
ultimately be falsified and replaced. Since most of the best hypotheses 
and theories in the history of science are demonstrably false, Popper
simply refuses to be embarrassed by the fact that the rules of his game of
science may not lead to truth.1 But things are very different when our
focus is on language learning. There, as we have seen, we have good 
reason to believe that the learner does get the right theory, viz. a grammar 
which is at least weakly equivalent to the grammar of his elders. And any 
account of the learning strategy the child uses must account for this fact. 


3 BETWEEN RIGID RATIONALISM AND POPPERIAN EMPIRICISM 


In the previous section I argued that Sampson’s Popperian mind was 
simply inadequate to explain the facts of language acquisition. We saw 
also that if we grant Sampson’s various assumptions, then the already 
weak case for rigid rationalism collapses. It is plausible, then, to suspect 
that the right theory of language acquisition lies between rigid rationalism 
and Popperian empiricism. In this last section I want to explore some of 
this intermediate territory. To facilitate the exposition I will begin by 
sketching Sampson’s Popperian language acquisition mechanism in some-
what more detail than I did previously. I will then note four rather
different ways in which the Popperian mechanism might be modified. 


(a) The Sampson—Popper Language Acquisition Device 
The basic features of the Sampson-Popper device can be represented as 
in Figure 2. The two central components in this picture are the hypothesis 


1 For an argument that he should be embarrassed, see Lakatos [1974]. 
generator and the theory tester. Much of what Sampson tells us about the
hypothesis generator is negative. Thus we are told that ‘the hypotheses 
formulated either by a child or by an adult are not drawn from a range 
(whether finitely or infinitely large) that could, even in principle, be 
specified in advance’ (p. 197, emphasis Sampson’s). ‘For Popper,’ Sampson
writes, 

‘it is a central belief that hypothesis- and concept-formulation are genuinely 
creative activities—not in Chomsky’s impoverished sense, according to which 
an activity is ‘creative’ provided the class of potential future instances is 
infinitely large even though definable by a finite set of rules, but rather in the 
everyday sense, according to which an activity is ‘creative’ only if future 
instances regularly fall outside any rules or norms that might be established in 
order to account for past instances.’ (pp. 197-8). 


Once a hypothesis has been produced, the theory testing component of 
the mind takes over. It is the task of this component to compare the 
hypothesis with the data provided by experience. If the hypothesis is 
found to be compatible with experience it is retained and used by the 
language processing mechanism which mediates the production and 
comprehension of speech, though the testing process continues while the 
hypothesis is tentatively accepted and used. If, on the other hand, the 
hypothesis is found to be incompatible with the data of experience, the 
theory tester rejects it and removes it from the language processing 
mechanism. When a hypothesis is falsified, the entire process starts over
again; the hypothesis generator is set in motion and produces a new 
hypothesis which is again tentatively accepted and used while it is con- 
tinuously tested against experience. Sampson several times suggests that 
the output of the hypothesis generator may be quite random (on the model 
of mutation in natural selection). If this is assumed, then a sympathetic 
portrait of the Popperian language learner had best add a pre-testing 
component which has some memory for past data and tests a new 
hypothesis for compatibility with this memory before passing it on to 
be tested against experience. Without such a pre-testing component, the 
system would be in danger of rejecting a falsified hypothesis, then re- 
placing it with a new hypothesis which was incompatible with data already 
provided by previous experience. 
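
The cycle just described can be put as a short Python sketch. It is only an illustration under assumptions that are not Sampson’s: a grammar is modelled as the finite set of sentences it generates, ‘falsified’ means that a heard sentence lies outside that set, and the generator draws from a small fixed pool (which Sampson explicitly denies can be specified in advance). The names `pre_test`, `popperian_learner`, `GR_ENG`, `GR_A` and `GR_B` are hypothetical, the grammar labels being borrowed from the discussion of figure 1.

```python
import random


def pre_test(grammar, memory):
    """Pre-tester: pass a grammar only if it generates every remembered sentence."""
    return all(sentence in grammar for sentence in memory)


def popperian_learner(pool, data_stream, rng):
    """Conjecture-and-refutation loop with a pre-testing component.

    pool        -- hypothetical finite stock of grammars (frozensets of sentences)
    data_stream -- the sentences supplied by experience, in order
    rng         -- a random.Random instance standing in for 'random mutation'
    """
    memory = []                       # past data kept for the pre-tester
    hypothesis = rng.choice(pool)     # first conjecture: nothing to test it against yet
    for sentence in data_stream:
        memory.append(sentence)
        if sentence in hypothesis:
            continue                  # compatible with experience: retained and used
        # Falsified: the generator runs again. Drawing at random until the
        # pre-tester passes a candidate is equivalent to choosing uniformly
        # among the candidates compatible with memory.
        survivors = [g for g in pool if pre_test(g, memory)]
        if not survivors:
            raise ValueError("no grammar in the pool is compatible with the data")
        hypothesis = rng.choice(survivors)
    return hypothesis


# Toy run: two grammars fit the data equally well, one does not.
GR_ENG = frozenset({"s1", "s2", "s3", "s4"})
GR_A = frozenset({"s1", "s2", "s3", "s5"})
GR_B = frozenset({"s1", "s2"})
pool = [GR_ENG, GR_A, GR_B]
data = ["s1", "s2", "s3"]
finals = {popperian_learner(pool, data, random.Random(seed)) for seed in range(20)}
```

Every grammar collected in `finals` is compatible with the data, but nothing in the loop favours `GR_ENG` over `GR_A`; which one a given learner ends with depends only on the random draw.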

Sampson tells us very little about the workings of the theory tester. In 
particular, he does not specify what relation must obtain between 
hypothesis and data for the tester to reject (or tentatively accept) the 
hypothesis. If we want to preserve the maximum resemblance between 
the language acquisition model and the orthodox Popperian account of 
hypothesis testing in science, then presumably the tester and the pre- 
tester would reject a hypothesis if and only if the hypothesis (along with 
other beliefs) is logically incompatible with the data of experience. 

A final point to note about Sampson’s Popperian language learner is
that the theory tester and the hypothesis generator are only minimally 
interactive. The theory tester can turn the generator on when a new 
hypothesis is needed, but it provides the generator with no guidance on 
the sort of theory to produce. The generator functions quite autonomously 
in deciding what theory should be offered for testing at any given time. 


(b) Four Ways to Change The Sampson—Popper Model 

Let us now consider some of the ways in which the Sampson—Popper 
model might be modified to overcome the inadequacy entailed by the 
rational scientist argument. Although I will discuss each modification 
separately, there is no reason why several of them could not be combined 
to produce a model of language learning that little resembles the one 
urged by Sampson. 


(bi) Imposing Restrictions on the Hypothesis Generator

One way to modify the Popperian learning mechanism with an eye toward 
enabling it to learn the right grammar is to impose considerably more 
structure than Sampson does on the hypothesis generator. This, of course,
is Chomsky’s recommended strategy. Rather than viewing the hypothesis 
generator as a device whose output does ‘not belong to any innately fixed
range’, Chomsky proposes that the hypothesis generator might be innately 
constrained so that all members of its output class exhibit a range of non- 
trivial common features. He also suggests that there may be innate 
constraints on the order in which the hypothesis generator produces 
hypotheses, thus, in effect, imposing an innate ranking on the already 
restricted output class. As we have seen, however, Chomsky’s rigid 
rationalist theory entails the existence of non-trivial universals. And 
Sampson has argued that there simply are no non-trivial universals that 
cannot be explained more simply. In any event, there is little evidence for 
the existence of the broad range of abstract syntactic universals that 
Chomsky’s view would lead us to expect. 

We should not conclude that the strategy of restricting the hypothesis 
generator is empirically untenable, however. At most what the absence 
of appropriate universals would demonstrate is that the hypothesis 
generator is not so constructed that all members of its output must share 
non-trivial common features. But this is only one very special way to 
restrict the output of the hypothesis generator. It is entirely possible to 
construct a hypothesis generator which can output only a tiny subset 
(finite or infinite) of the logically possible grammars, but a subset whose 
members share no common properties, save trivial ones.1 Thus it is 
entirely possible that the correct language acquisition model will not be a 
rigid rationalist model but nonetheless will impose strong constraints on 
the output of the hypothesis generator. 


(bii) An Inductive Theory Tester

A second way in which a learning model might depart from the Sampson- 
Popper model is in the standards of acceptability and unacceptability 
imposed by the theory tester (and pre-tester). Popper, notoriously, denies 
that there is any such thing as induction or a logic of induction. Thus in 
the Popperian mind a hypothesis can fail only if it (in conjunction with 
other tentatively accepted hypotheses) is logically incompatible with the 
data. But despite the premature obituaries from Popperians, inductive 
logic is alive and kicking.2 Thus we might construct an alternative model
by allowing the hypothesis tester to reject theories which are rendered 
implausible by the data, where implausibility is judged by the standards 
of one or another system of inductive logic. A more radical departure 
from the Sampson—Popper picture would be to have the theory tester 
continue to test the working hypothesis until the hypothesis reaches a 


1 Cf. Stich [1978a]. 
2 For one such obituary, cf. Lakatos [1974], p. 162. 
certain degree of confirmation, as measured by its inductive logic. When
the cut-off has been reached, the acquisition system simply shuts down, 
and the hypothesis is subject to no further testing. 
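
A tester of this kind can be sketched in a few lines of Python. The sketch is an assumption-laden illustration, not a proposal found in the text: simple Bayesian updating stands in for ‘one or another system of inductive logic’, a grammar is again modelled as a finite set of sentences, each grammar is assumed to emit its sentences with equal probability, and the names `inductive_tester`, `cutoff` and `noise` are hypothetical.

```python
def inductive_tester(grammars, data_stream, cutoff=0.95, noise=1e-6):
    """Score grammars by posterior probability and stop at a confirmation cut-off.

    grammars    -- dict mapping a grammar's name to the set of sentences it generates
    data_stream -- sentences supplied by experience, in order
    cutoff      -- degree of confirmation at which the system shuts down
    noise       -- small likelihood given to a sentence the grammar does not generate
    Returns (name of best grammar, its posterior, number of sentences examined).
    """
    # Uniform prior over the candidate grammars (an assumption of the sketch).
    posterior = {name: 1.0 / len(grammars) for name in grammars}
    seen = 0
    for sentence in data_stream:
        seen += 1
        for name, sentences in grammars.items():
            if sentence in sentences:
                likelihood = (1.0 - noise) / len(sentences)
            else:
                likelihood = noise
            posterior[name] *= likelihood
        total = sum(posterior.values())
        posterior = {name: p / total for name, p in posterior.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= cutoff:
            return best, posterior[best], seen   # cut-off reached: no further testing
    best = max(posterior, key=posterior.get)
    return best, posterior[best], seen


# Both toy grammars are logically compatible with the data; only one is plausible.
toy = {"small": {"a", "b"}, "large": {"a", "b", "c", "d"}}
best, confidence, used = inductive_tester(toy, ["a", "b", "a", "b", "a", "b"])
```

In the toy run no sentence ever falsifies the larger grammar, so a strictly Popperian tester could never reject it; the inductive tester nonetheless comes to prefer the smaller one and stops before the data run out.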


(biii) Interaction between Theory Tester and Hypothesis Generator 


In both the Sampson-Popper model and the Chomskian model, the theory 
tester exerts little control on the hypothesis generator. This is an inefficient 
and implausible feature of both models. It is certainly possible to design 
systems in which the theory tester not only determines that a hypothesis 
is unacceptable, but also tries to analyse why. In such a system the theory 
tester would send messages to the hypothesis generator which would serve 
to direct the generator in its search for a replacement hypothesis. Of course 
this talk of directing a search for a replacement hypothesis makes little 
sense unless there is a structured space of hypotheses to be searched. Thus 
we could not have an interactive learning system if the hypothesis generator 
is the ‘genuinely creative’ random component conjured by Sampson. We 
should also note that as learning models become increasingly interactive, 
the distinction between the hypothesis generating component and the theory 
testing component begins to break down. There is no reason a priori to 
assume that these two functions can be distinguished in the mechanism that 
actually underlies human learning. 


(biv) A Non-Rational Theory Tester

Much recent thinking in cognitive psychology has been motivated by 
what might be called the metaphor of the rational agent. The central idea 
of this metaphor or conjecture is that a considerable range of mental 
processes might be interestingly analogous to the process whereby a 
rational agent infers to a hypothesis in science or in common sense 
reasoning. Theorists who pursue the analogy make a pair of assumptions 
about the mental processes under study. The first is that the process can 
be viewed as having an input and an output, where the input is interestingly 
analogous to data and the output is interestingly analogous to a hypothesis. 
The second assumption is that the processes mediating between input 
(or data) and output (or hypothesis) are rational inferential processes 
which are interestingly analogous to some normatively acceptable rules 
of inference. Both Chomsky’s theory of language acquisition and the 
Sampson—Popper theory clearly share these assumptions. Both theories 
view language acquisition as inference to a hypothesis (in this case a 
grammar) on the basis of primary linguistic data. And both view the 
inference itself as a rational one. The influence of the rational agent 
metaphor is not restricted to theories of language acquisition. R. L. 
Gregory, for example, has proposed that perception be viewed as a process
of inference to a hypothesis, where the percept itself is the hypothesis, 
and, in the case of vision, the premises are provided by retinal stimulation.1
The visual system, according to Gregory, attempts to produce the best
hypothesis that will account for the data of visual stimulation. When the
system is misled, the result is a visual illusion.2

Now what has been little noted by either cognitive psychologists or by 
philosophers is that the two assumptions of the rational agent metaphor 
are separable. Thus it is possible to have a mental process which fits the
first assumption but not the second. Such a process would produce a 
hypothesis on the basis of perceptual (or other) data. But the processes 
mediating between data and hypothesis would not resemble any normatively 
acceptable or rational system of inference. Indeed, it might be analogous 
to some system of inference which would be viewed as patently irrational 
if applied in conscious common sense or scientific reasoning. Thus, in 
the case of language acquisition, we can imagine a learning mechanism 
which favours or disfavours a category of grammars on the basis of 
evidence that a rational theorist would rightly take to be irrelevant to the 
choice made, or even a mechanism that takes as evidence in favour of a 
grammar what a rational theorist would take as evidence against it. What 
is important about a language learning mechanism, from the point of view 
of natural selection, is that it should generally get the right answer; how 
it turns the trick is of less moment. And, as Quine has noted, language 
learning is ‘a put-up job’.3 To get the right answer is to get the same answer
that our senior co-linguists got sampling similar data. It matters little 
if our inferential strategies are, from a normative point of view, irrational, 
so long as our seniors are similarly irrational. They are, of course, since 
we are ‘birds of a feather’.4 

This completes my brief catalogue of modifications for the Sampson- 
Popper mind. I do not suggest that my catalogue is anywhere near complete. 
However, I hope I have said enough to establish that there is much of value 
to explore between Chomskian rationalism and Popperian empiricism.5


University of Maryland, College Park 


1 Cf., for example, Gregory [1970], pp. 30 ff.

2 For a philosophical elaboration of Gregory’s theory, cf. Harman [1973].
3 Quine [1969], p. 125.

4 Ibid.


5 Earlier versions of this paper were read at the University of Lancaster and at a week-end
sponsored by the University of Warwick at Cumberland Lodge, Windsor Great Park. 
Comments in both places led to substantial improvements. I am particularly grateful 
to Geoffrey Sampson for his comments. My research has been supported by the 
U.S.-U.K. Educational Commission and by The American Council of Learned 
Societies. 
Relative Essentialism
by EVAN FALES 


1 Quine’s dismissal of Aristotelian essentialism1 seems to have ushered
in an era of renewed popularity for essentialist doctrines. The plausibility 
of any such doctrine rests however upon its ability to resolve a number of 
traditional puzzles, of which I shall mention four: 


(1) What is the nature of the (alleged) necessary connection between a 
thing and certain of its properties? 
(2) What are the grounds for a classification according to ‘real’ essences? 


(3) In what sense is it possible to empirically discover the real essence 
of a substance or substance-kind previously identified? 


(4) What is to be done about borderline cases that refuse to fit un- 
equivocally into any (putatively) natural classification? 


With respect to nominal essences, a fairly clear answer to (1) can be 
given in terms of the force of linguistic convention; but with respect to 
putative real essences, (1) poses a difficult problem. The response to (1) 
may be varied, depending upon what sort of thing and essential property 
are being considered. One suggestion has been that for at least some 
essential properties of material objects, the answer is to be sought in an 
understanding of the physical laws which govern those objects.2 I shall
argue that this claim is partially right but oversimple. Similarly, one 
response to (2) is to suggest that it is scientific theories which provide 
the ultimate grounds for non-conventional classification. It might further 
be expected, if this is correct, that philosophical analysis of the structure 
of such theories might yield some insight into the nature of this grounding, 
and into the nature of substantial change. But to the best of my knowledge, 
very little has been done to elucidate in any detail how it is that theories 
determine taxonomical structures (if they do).3 I hope to show that an
investigation of this problem sheds considerable light upon questions (1), 
(2) and (4). 

Received 16 June 1978 
1 Quine [1963], p. 156. 

2 See for instance Irving Copi [1954]. 
3 Except, to some extent, with respect to biology—where, for reasons that will presently 
emerge, I believe a precise solution to the problem to be both hopeless and gratuitous. 


Cf. for example, David Hull [1965]; Michael Ruse [1969]; and Berent Enç [1976]. Some
suggestive more general remarks are made in Harré and Madden [1975]. 
Question (3) has received considerable attention. The locus classicus is
Locke’s Essay1; recent writers, prominently Donnellan [1962] and [1972],
Kripke [1972], and Putnam [1970] and [1973], have I believe provided a 
generally adequate direction for an answer to it by showing that reference 
to an individual or species can be fixed independently of knowledge or 
specification of at least some essential properties, so that the necessary 
connection between a thing or species and those properties cannot be 
analysable in terms of some conventional use of the word which designates 
that thing or species. Since I accept this already much discussed solution 
in its main outlines, I shall not discuss it here. 

I shall deal with just one essentialist thesis, viz. the thesis that there 
are natural kinds. With respect to this thesis, I shall first note that if certain
scientific theories determine a taxonomy for objects falling within their 
domain, then that taxonomy is grounded in empirical reality (if those 
theories are true) and not merely in conventions. Secondly, I consider 
the problem posed by question (4) for such putative taxonomies, and 
show how this puzzle, which exercised Locke, can be solved for a certain 
class of theories. Finally, I consider the question whether, and how,
theories of the requisite sort do determine taxonomies, i.e., how putative
natural-kind classifications are parasitic upon putatively true theories and 
are, like the theories themselves, subject to revision under pressure from 
empirical information about the world. The first and third parts of the 
argument constitute an attempt to answer question (2), and to develop 
an approach to the treatment of question (1). I conclude that certain 
theories do determine taxonomies, and that therefore there are some 
(possibly unknown) natural kinds. Thus, what natural kinds we believe 
there to be at any time will be relative to whatever most fundamental 
theory is at that time believed to be true; and what natural kinds there are 
is determined by whatever such theory actually is true. 


1 Locke [1959], book III, chapter VI.

2 To begin, let me assume what later needs to be shown, that certain
scientific theories do determine, or at least place strong constraints upon,
the taxonomy of objects within their domain. To hold this one need not
deny that other taxonomic divisions of the same range of objects can be
supplied; nor that, relative to some purposes other than that of scientific
explanation, such alternative taxonomies might be more ‘natural’. But
explanation-dependent classifications hold a special position, for those
classifications are natural in the special sense that, insofar as theory
reflects the way things are in the world, independently of human purposes,
theory-based classification will also reflect the way things are, independently
of such purposes. If, then, theory determines, in some non-arbitrary
way, a theory-relative classification of objects in its domain, and if the 
form of theory itself is determined by empirical evidence, then to that 
extent theory-relative classifications are not creatures of convention or 
arbitrary decision.1 

If, furthermore, there is some theoretical description of the physical 
world that is both true and ultimate, then (whether we have or ever shall 
discover that theory or not) it follows that there is some classification of 
objects in the physical world in terms of their real essences, for such a 
classification will mark the distinction between those sorts of changes that 
constitute a transition from one natural kind to another (hence a substantial 
change) and those that do not, in terms of the only sense we can give to 
the notion of a natural kind. I shall assume without argument the truth 
of the required premises that scientific theories have empirical content— 
hence are not true merely by conventional stipulation or definition, and 
that there is some ultimate true theory. 

It remains to be shown that a theory-based taxonomy can yield a suitable 
division of objects within the theory’s domain into species. There has 
been a tendency to cite biological species and the chemical elements as 
paradigms of natural kinds. This choice of examples ignores a problem 
posed by Locke: if the properties by virtue of which something belongs 
to a natural kind are necessarily instantiated by it, then how are we to 
account for borderline entities (and for gradual substantial change)? 

I believe this difficulty cannot be ignored but can be outflanked. I shall 
consider here only classes of theories of a certain type, viz. those that 
form reductive hierarchies, such that T_i reduces to T_j if the domain of
T_j is objects which compose, i.e. are the material parts of, the objects in
the domain of T_i, and where the properties and laws obeyed by the former
objects are used to explain the properties and laws obeyed by the latter. 
Any actually proposed hierarchy must be finite in length; hence there 
must be a theory which is its bottom-most member, the theory to which 
all others in the hierarchy can be reduced. I shall designate such a theory, 
which is (perhaps only temporarily or provisionally) the most fundamental 
one, ‘T_F’.

1 Thus the strength of my thesis depends on the extent to which descriptions of the data
and theories determined by the data are independent of conventional choice.

Now my first claim is that the ultimate or ‘smallest’ parts, which
constitute the domain of T_F, must be classifiable in a way which escapes
the difficulty raised by question (4); that is, there cannot be any borderline
entities under the classification provided by T_F, on pain of forfeiture either
of T_F’s claim to truth or T_F’s claim to genuine fundamentality. Whatever
the most fundamental entities (hereafter ‘FEs’) prove to be, they can,
however, be assembled in law-constrained ways into the complex physical 
structures which constitute the domains of the theories reducible to T_F;
and the classification of these derivative entities (or ‘DEs’ as I shall call
them), as determined by their respective non-fundamental theories, can 
admit of borderline cases not neatly classifiable by any such taxonomy. 
This fact, however, does not threaten essentialism, for so long as the 
existence of all such borderline cases among DEs can be explained in 
terms of a reductive theory T_F which is not plagued by instances of
borderline FEs, the failure of perfect classification at the derivative level 
is neither theoretically crippling nor a challenge to the thesis that there is 
a division of the physical world into natural kinds. This strategy outflanks 
(4). 

For Locke, indeed, the difficulty of classifying objects unambiguously 
into natural kinds is closely connected with his supposition that if objects
have real essences, then those essences are determined by their submicroscopic
constitution.1 For every manifest difference between objects
must ultimately depend upon some difference in their microstructures, 
but either every difference in microstructure entails a difference in real 
essence—in which case it is unlikely that any two macroscopic objects 
will ever be of the same natural kind—or else there is no basis for classifica- 
tion except the arbitrary choice of some differences as important and 
others as not. However, the latter choice amounts to the stipulation of 
nominal essences. So Locke’s recognition of the requirements of reductive 
explanation leads him to a requirement for natural kinds which, for DEs, 
seems hopelessly stringent. Clearly, if the difficulty is to be avoided, the 
link between classification and the explanation of property-differences 
must be somehow loosened. But, whatever rough classifications of DEs 
scientific theory may nonetheless provide, the essentialist may concede 
Locke’s difficulty if it can be shown that FEs fall under natural kinds,
which provides precisely the required loosening.

1 Locke [1959], book III, chapter VI, sections 8 and 9.

It will be useful here to introduce some terminology. A putatively
terminal theory T_F may yield its place to some other theory for either of
two reasons: either T_F is rejected in favour of some alternative theory, or
else T_F is retained but reduced to some more fundamental theory. In the
latter case, the putatively elementary entities lose their status as FEs and
are taken to be DEs in the domain of the reducing theory. So FE status
for a class of particles is always provisional. If indeed the most fundamental
theory accepted at a given time is intrinsically such as to require either
modification or further reduction, then the non-ultimacy of its FEs will
be ascertainable from features internal to the theory. It may happen, on
the other hand, that T_F has no intrinsic features that either force or forbid
our anticipating some still more fundamental theory. I shall call theories
of the former sort unsatisfying theories, and theories of the latter sort
satisfying theories.1 It is of some interest to determine what criteria a
theory must meet in order to be satisfying and hence a candidate for 
genuine fundamentality. Certain such criteria will be proposed in this 
paper. 

There is a second distinction I shall use. Let the set of possible states 
of affairs permitted by the laws of a theory T independently of any 
boundary conditions or antecedent states of affairs be called the set of 
unconditional possibilities relative to T. And let the possibilities at a time t
permitted by T, together with specification of the relevant prior circumstances,
be called the pre-conditioned possibilities at time t relative to T.
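
The distinction can be illustrated with a deliberately artificial sketch. Assume a hypothetical theory of a single particle on a ring of sites, whose only law is that the particle moves exactly `hop` sites in either direction at each time step. The ring, the law, and all names below are invented for illustration and are not drawn from any physical theory discussed in this paper.

```python
# Toy model (an assumption for illustration): one particle on a ring of sites.

def unconditional_possibilities(n_sites):
    """States the law permits regardless of any prior state: every site."""
    return set(range(n_sites))

def preconditioned_possibilities(prior_site, hop, n_sites):
    """States the law permits at the next step, given the prior site."""
    return {(prior_site + hop) % n_sites, (prior_site - hop) % n_sites}

all_states = unconditional_possibilities(5)
next_states = preconditioned_possibilities(prior_site=0, hop=1, n_sites=5)
print(sorted(all_states), sorted(next_states))
```

On this sketch the pre-conditioned possibilities are always a subset of the unconditional ones; the prior circumstances narrow the range the laws alone leave open.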

Now the aberrancies which provoked Locke’s challenge to essentialism
were in every case DEs. What I shall try to show is that either our most
fundamental theory T_F generates an aberration-free taxonomy of its FEs
or else T_F is an unsatisfying theory, and hence not genuinely fundamental.
This may be seen as follows. The taxonomy generated by the laws of T_F must
be such that


(a) for every natural species of FEs (as defined by that taxonomy), every 
member of that species must obey the same set of laws, i.e., must
be such that the unconditionally possible states of affairs into which 
it can enter would not differ qualitatively under substitution of 
another member of that species; 


(b) this set of laws is distinct for each natural species; and 


(c) every entity in the domain of T_F is either a composite entity or else
an FE which falls under one of the natural species of T_F’s taxonomy.
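
Conditions (a) to (c) can be read as a simple consistency check on a taxonomy. The following is a minimal sketch under strong simplifying assumptions: each species is represented only by a set of law labels, and each entity by a flag saying whether it is composite together with the set of laws it obeys. The labels `S1`, `S2`, `L1`, and so on are hypothetical placeholders, not part of the argument.

```python
# Toy representation (assumed for illustration): a taxonomy maps each
# species name to the frozenset of law labels its members obey.

def laws_are_distinct(taxonomy):
    """Condition (b): no two species share the same set of laws."""
    law_sets = list(taxonomy.values())
    return len(set(law_sets)) == len(law_sets)

def classify(entity, taxonomy):
    """Conditions (a) and (c): a non-composite entity falls under the
    species whose laws it obeys. Returns None for an aberrant FE."""
    if entity["composite"]:
        return "composite"
    for species, laws in taxonomy.items():
        if entity["laws"] == laws:
            return species
    return None

taxonomy = {"S1": frozenset({"L1", "L2"}), "S2": frozenset({"L3"})}
aberrant = {"composite": False, "laws": frozenset({"L9"})}
print(classify(aberrant, taxonomy))
```

An entity for which `classify` returns `None` corresponds to the aberrant FE considered next: the sketch offers no third option besides adding a species or treating the entity as composite.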


1 Thus for a theory to be satisfying is not incompatible with the possibility of discovering
facts which would militate for reduction to some deeper theory. In the characterization
of unsatisfying theories, I have used the notion of ‘intrinsic’ features of a theory; this
deserves clarification. A (trivial) intrinsic characteristic which would render a theory
unsatisfying would be logical inconsistency, but other aspects of the form of the laws
of a theory can also do so. The intrinsic features of a theory contrast with ‘extrinsic’
reasons for accepting or rejecting it, viz. conformity or lack of conformity with the
empirical data. However, as is well known, the distinction drawn in this way will not
be sharp, for a lack of match-up with the data can often be provisionally accommodated
by a theory, for instance by making certain of its laws defeasible, i.e., in this case by
changing its intrinsic characteristics. My argument will be that this manoeuvre is bought
at the price of rendering the theory unsatisfying, though not necessarily false.

Now suppose there were an aberrant FE, that is, a putatively non-composite
entity in T_F’s domain whose behaviour was not governed by
the laws governing any of the recognised species of FEs. Then either the
aberrant FE must be recognised as the first discovered member of an
entirely new natural species, or else the deviance from other FEs must 
be explained by postulating some substructure which is altered from the 
‘norm’ in the aberrant FE. (For one cannot simply countenance the 
existence of such a singular particle as an FE which does not obey the 
laws associated with any known sort of FEs; and if the FE is law-abiding, 
then it must be possible for there to be others like it and laws to match.) 
But recourse to the former manoeuvre minimally requires adding a new
fundamental class of entities and new laws to T_F, including laws governing
the interactions between the new particles and the other FEs of the
theory; it may require much more extensive revision of T_F. For, first,
T_F will characteristically predict the existence of certain kinds of particles
and exclude the existence of others; and secondly, if the DEs of the world
are now shown to contain FEs of a novel kind, then many of our explanations
of the behaviour of those DEs may have to be reworked.
Alternatively, recourse to the latter strategy robs T_F and members of its
taxa of their fundamentality. In either case, the existence of an aberrant
FE renders T_F unsatisfying. Conversely, if T_F is satisfying, then every
known elementary particle must be properly classifiable by the taxonomy
of T_F. Thus, the explanatory adequacy of a fundamental theory with
respect to its DEs is necessarily linked to taxonomic adequacy with respect
to its FEs; but taxonomic difficulties with respect to DEs do not necessarily
imply explanatory inadequacy. 

If Locke was looking for natural kinds, he should therefore have looked 
to fundamental particles.1 Differences between members of a derivative 
kind will indeed reflect differences at the level of microstructure; but such 
differences need not be construed as differences in essence. At the same 
time, it is entirely consistent with this proposed solution to leave open 
the possibility that certain ranges of DEs may be at least roughly classifiable 
in terms of some theoretically motivated taxonomy. 

1 Locke may perhaps be excused on the grounds that no developed micro-theory was
available in his day, and on the grounds of his (ill-founded) scepticism concerning the
possibility of such knowledge.

With respect to FEs however, our position bears a distinct resemblance
to Locke’s: individual members of a fundamental species must resemble
one another in the strong sense specified by condition (a) above. It is
however necessary to investigate the import of (a) in more detail. For
obviously FEs must have some contingent properties, and these can differ
for FEs of the same kind or for a given FE at different times. We may ask,
what is the theoretical role of the contingent properties of FEs, and how
does T_F enable us to distinguish between contingent and non-contingent
properties? This takes us to the second problem of this paper. For unless
it can be shown how T_F determines the fundamental sortals, the proposed
solution to the first problem must fail as well.


3 The theories we have been considering are ones which explain the 
properties of DEs by reference to the properties of FEs. FEs will be such 
that none of their properties are (provisionally) subjected to further 
reductive explanation. But if some of their properties are contingent, then 
we may properly ask why a given FE has the contingent properties it 
does have at a given time, and not others. Such explanations cannot be 
reductive, but must rather be of a causal sort, involving appeal to the 
laws of T_F and information about some prior state of the system. If T_F is
to be a satisfying theory, then every physical property of an FE must either
be such as to admit of genetic explanation in terms of laws of T_F and
other states of affairs, or else be ‘given’ in a sense not requiring further 
explanation. 

Some of the contingent properties of FEs will be determinate values of 
determinable properties which are not contingent properties of those FEs. 
Thus the particular spatial location and velocity of an FE at a given time 
are, for current physics, contingent properties of FEs; but the property 
of having a spatial location1 and a velocity (both relative to any coordinate
system) are not contingent. Those properties of FEs which admit of
explanation in terms of T_F may be either relational, such as spatial position
and velocity, or non-relational, such as spin. 

We may now define a further pair of notions, that of simple and non-simple
properties of FEs. A property of an FE is non-simple if and only
if the explanation of that FE’s having it can be given in terms of other
properties of the FEs of T_F; and simple otherwise. We may now
formulate our question as: What kinds of non-simple and simple properties
could the FEs of a theory T_F have, if T_F is to be satisfying? It follows from
our definitions that if T_F is satisfying then the simple properties of its FEs
do not intrinsically require any further reductive explanation, and that 
since the contingent properties of an FE must be explainable, they must 
be non-simple properties. 

1 ‘Spatial location’ must here be interpreted broadly enough for particles to accommodate
the modifications imposed by quantum mechanics.

The relationship between the attribution of contingent properties to
FEs and explanation requires analysis. For obviously, such attributions
will play an essential role both in explanations of the properties of DEs
and in explanations of the non-simple properties of other FEs and of the
same FEs at later times. So it seems that the manner in which theories
determine classification relative to explanatory requirements is not entirely
straightforward. It is not sufficient, therefore, to merely assert that 
theoretical considerations somehow ground the doctrine that there are 
natural kinds; one must be able to show how such a taxonomy is determined
by theory. A theory-determined distinction between sortal-determining 
and contingent properties of FEs must, in particular, be a function of a 
distinction in the manner in which those properties respectively play a 
role in theoretical explanation. 

Now exact surface-regularities in the world are few and far between. 
How can one hope for simple theories in the face of the immense fecundity 
of nature? The Greek atomists ingeniously saw in the combinatorial 
possibilities of atoms the kind of conceptual manoeuvre which provides 
a particularly pure form of the stratagem needed to effect theoretical
potency. Some of the properties of atoms are permanent; some of them 
are ephemeral. Ephemeral properties are required in order that the 
combinatorial possibilities can be realised; permanent ones in order that 
these possibilities can be defined, that is specified and restricted. In this 
way the structure of the theory provides richness without promiscuity. 

Among the permanent properties of the FEs of such a theory will be 
those which are collectively sufficient to determine the possible microstates 
which are unconditionally permitted by the theory. The claim that this 
must be so can be argued as follows. A theory T_F cannot be satisfying
unless any changes in the FEs of T_F are subsumable under some laws
of T_F. In general we must consider two types of changes: substantial
changes and identity-preserving changes in FEs. Whether a given change
counts as the one or the other will depend upon how our classification 
is determined with respect to the property(s) appearing or disappearing 
in the change. If any such property is a permanent property of a given 
FE, X, relative to our classification, then the change is substantial; other- 
wise not. Thus relative to that classification, permanent properties are 
essential and ephemeral ones are contingent. 

Now an explanation of a law-governed change in X using T_F may make
appeal to both permanent and ephemeral properties of X and/or other 
FEs. These taken together determine those states which are the pre- 
conditionally possible outcomes of the change in X. If the entire range 
of unconditionally possible states is to be determined, constraints upon 
the range of possible ephemeral properties and upon changes in ephemeral 
properties must be governed by some set of permanent properties. For the 
laws governing such changes are universal; and as such, they imply that 
there must be something permanent in the nature of those FEs to which 
they apply. Where changes involving ephemeral properties can be ultimately
explained by appeal to laws and hence to some underlying
permanent properties, we may say that the ephemeral properties are 
derivative. 

I conclude that if a theory admits of FE-states whose unconditioned 
possibility is controlled by non-derivative ephemeral properties, then that 
theory will be unsatisfying. In general, if we have a change from one 
microstate to another involving some or all of the same individual FEs
then we may expect some of the permanent properties of those FEs to
determine what changes from that microstate are possible.1

If, in other words, our theory is to be predictive, we must have some 
restrictions upon the accessibility of any given microstate e_j from any
prior microstate e_i. If all the unconditionally possible microstates are
specifiable, this information can be displayed in a two-dimensional matrix
[e_i, e_j] whose entries specify the probability of a transformation from any
state to any other state. For a predictive theory, not only the unconditionally
possible states, but also the accessibility matrix will have to be
deducible ultimately from the permanent properties of the FEs.2
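
As a rough illustration of how an accessibility matrix might be deduced from a permanent property, consider an invented toy theory: a single particle on a ring of sites, where the particle's kind fixes a hop length (its permanent property) and its site is its ephemeral property. The law assumed here, that the particle hops forward or backward with equal probability, is purely hypothetical, as are all the names in the sketch.

```python
# Toy model (an assumption for illustration): states are sites on a ring;
# `hop` is a permanent property of the particle's kind.

def accessibility_matrix(hop, n_sites):
    """Entry [i][j] is the probability of a transition from site i to
    site j in one step, under the assumed equal-probability hopping law."""
    matrix = [[0.0] * n_sites for _ in range(n_sites)]
    for i in range(n_sites):
        for j in ((i + hop) % n_sites, (i - hop) % n_sites):
            matrix[i][j] += 0.5
    return matrix

for row in accessibility_matrix(hop=1, n_sites=4):
    print(row)
```

Nothing about the particle's present site enters the construction: the whole matrix follows from the hop length and the assumed law, which is the sense in which the matrix is deducible from permanent properties alone.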

The appearance of an ephemeral property must be controlled by some 
law(s) L. But the parameters in terms of which L is formulated must 
include variables whose values are filled by some other ephemeral proper- 
ties of FEs, for if all of these were values of permanent properties, then 
the presence of the derivative property could not but be time-independent. 
For a causal theory, the ephemeral properties which figure in the deter- 
mination of the presence of some microstate in accordance with L will be 
prior ephemeral properties of some set of FEs. Thus the ephemeral 
properties of a system of FEs are non-simple, but the sense in which they 
are explained requires the importation of information about other non-simple,
ephemeral properties of the system. The determination of later
ephemeral properties from earlier ones must, however, be controlled by
the permanent properties of the FEs of the system.

1 This point is independent of any view about whether microphysics is or is not
deterministic, so long as it is a predictive theory in some sense.

2 The transformation-probability is properly a relation between the micro-states rather
than between their component FEs, taken individually or collectively. But this relation
must obtain between the states derivatively, that is, in virtue of some properties and
relations which could be attributed to the individual FEs involved in those states.

On this analysis, at least some of the permanent properties of an FE
will determine what its possible relationships and interactions with other
FEs of specified sorts could be, given any possible antecedent circumstances.
Hence, if we know that the world contains a certain set of entities
of certain kinds, we can, knowing their relevant permanent properties
and associated laws, in principle determine the possible states of any
system, independently of any information about further contingent
matters, such as the actual present state of that system. 

The general form of explanation in terms of a fundamental-level theory 
of microevents or states will therefore involve determinations of the 
properties of such states which depend both upon the permanent properties 
of the FEs involved, and upon certain prior ephemeral circumstances 
concerning them. Now just as there might be permanent properties of 
FEs which are explanatorily derivative (in the sense that they can be 
accounted for in terms of simple properties), so too it may be asked 
whether there is some minimal set of ephemeral properties which are 
required in addition to permanent ones in the calculation of preconditioned 
possibilities, such that the resulting set is sufficient to explain any other
ephemeral properties falling within the purview of the theory which the 
system has at the time in question. 

Suppose we have a closed system W containing a specified set of individual 
FEs at some time t_0. The specification of some of the simple properties of
these FEs, and some of their ephemeral properties at t_0, should allow us to
generate explanations for the presence of the other properties of W and of its
FEs at t_0 and at later times, if the theory is predictive. What is being sought,
then, is a minimal set of simple and ephemeral properties such that their 
specification at any time is sufficient to explain any of the other properties of 
W which the theory purports to be capable of explaining. In other words, we 
are looking for a minimal set of properties such that their specification will 
identify each of the unconditionally possible states of W with respect to 
the characteristics relevant to their lawlike behaviour, and hence sufficient 
to determine what future states are accessible from any given one. I shall 
call any such set of properties a base set for W.1 It is essential to recognise
that the determinate values of the ephemeral properties required for the 
base set cannot be determined solely from a specification of the simple 
properties of the FEs of W, though the simple properties may be sufficient 
to determine what determinable properties (e.g., spatial location) are such 
that specification of their ephemeral determinate values will suffice to 
provide a full base set for W. 

1 Should we discover a putative FE with a property which is neither contained in the base
set nor explainable within the framework provided by the postulated base set and the
relevant laws, then clearly the theory itself will be in jeopardy, and so too (in general)
our classification of FEs. Hence it will not in general be possible to decide, short of
constructing a new theory, whether the property in question should be construed as
contingent or permanent.

How should someone who holds a theory T_F proceed in classifying the
FEs of that theory? The obvious move is to classify them according to
their simple properties, for, insofar as theoretical considerations are at
stake, the ‘natural’ classification procedure will be the one which allows
us most simply to formulate in universal terms the fundamental laws of 
the theory. For since T_F’s laws and explanations depend upon what simple
properties the relevant FEs possess, universality can only be achieved if 
the kind-nouns in terms of which they are identified in those laws group 
them according to their simple properties. This may perhaps seem trivial, 
but what is not trivial is the choice of properties involved, for the matter 
of constructing a suitable base set is not trivial. 

What sortals, then, should T_F countenance? Since, for a satisfying
theory, the permanent properties of FEs should control the range of
states admitted as unconditioned possibilities by T_F, there is a connection
between what sortals are chosen and the explanatory power of the theory. 
But thus far, the argument has only shown that the assignment of perma- 
nent or ephemeral status to a property depends upon the requirements 
imposed by the universal laws of the theory. Might there not be, however, 
some arbitrariness in the way these laws can be formulated; in particular, 
an arbitrariness which permits, under some suitable reformulation of the 
laws, the conversion of any permanent property to an ephemeral one, or 
vice versa? Compare the following two accounts: 


I. Scientist A says that individual FEs of type S have an ephemeral
property P_1, such that under some conditions an individual X of this
type will have P_1, and under others it will have some other property P_2.
Whether X has P_1 or P_2, according to A, can be explained in terms of some
permanent property P* which things of type S have, together with other
information concerning X’s past or present contingent properties.


II. Scientist B, by contrast, claims that there are two types of FEs,
S_1 and S_2, such that they have identical permanent properties, with the
exception that S_1s have P_1 as a permanent property, whereas S_2s have
P_2. Moreover we may suppose that B agrees with A as to the explanatory
relevance of P*, though B will construe the situation as one in which an
individual X_1 of type S_1 annihilates into an individual X_2 of type S_2, and
in which it is X_1’s having P* (and perhaps also X_2’s having P*), together
with the other information, which are to be referred to in explaining the
transition from X_1 to X_2.


Both of these alternatives seem equally admissible: certainly, there does 
not seem to be any obvious reason why the explanatory utility of reference 
to P* should hinge on conventions regarding whether we have individuated 
one changing individual with that property, or two unchanging individuals, 
each with that property, such that one appears when the other disappears. 
It is precisely this difficulty that has plagued Wiggins, Hirsch, and others
in their effort to make out the distinction between substance-sortals and
phase-sortals.1 It looks, roughly, as if we can freely trade ephemeral
properties in the base set for ephemeral individuals, and vice versa. But 
the consequences of permitting symmetry between these two strategies 
are hardly acceptable. Suppose we were to reconstrue every determinate 
property among those in the base set as a simple, permanent property; 
correlatively, what were for A cases of property-change in FEs are 
reconstrued by B always as cases of substantial change. The result would 
be that every distinguishable state of the world (or of whatever domain 
within the world is governed by the theory) that is describable at the
level of the FEs of the theory will be found to contain different FEs
(indeed, an entirely new set of FEs, if permanent relational properties are
admitted which can relate any two or more individual FEs). Not only
would every distinguishable state be comprised of different individuals,
but it would be comprised of different kinds of individuals; the ephemeral
properties which had been available for the individuation of members of 
a kind would now be pressed into service to distinguish between kinds. 
So construed, our world would be a Heraclitean one. 

The mistake in allowing this lies in the incorrect reasoning that 
contingent properties in the base set can be exchanged for simple properties 
(for with respect to taxonomy, permanent properties will specify species; 
ephemeral properties are accounted contingent). It has already been 
argued that ephemeral properties must be non-simple; hence an asym- 
metry exists, with respect to explanatory requirements, between the 
contingent and the simple members of the base set of properties. It is 
not B’s addition of superfluous entities to A’s ontology which is being 
objected to here, but rather that the specifying properties in terms of which 
entities in B’s classificatory scheme are identified include ones which have no
fundamental explanatory role with respect to determination of the unconditionally
possible states of the system. Thus, for instance, the identification of
X_1 as being an individual of type S_1 involves ascribing X_1 a specifying
property P_1 which is not simple.* Hence, the identification of X_1 requires
more information than is needed for the purpose of whatever explanations
entities of type S_1 will have a role in. So explanations under B’s classificatory
scheme will involve reference to things under descriptions which are at 
least in part explanatorily otiose, as regards the determination of un- 
conditioned possibilities and the calculation of the accessibility matrix. 

1 Cf. David Wiggins [1967] and E. D. Hirsch [1976], pp. 20-2. For a criticism of Wiggins’ 
see Gerald Vision [1970]. 
* $P_1$ must be a defining property for things of type S (in addition to being a permanent
property) because it is ex hypothesi the only property in virtue of which S s are
distinguished from S\s.
We may observe that an objection can also be made against a classificatory
scheme which differs from A’s in classifying FEs by dropping some
of the properties which are used in A’s scheme for specifying some kind. 
For now—assuming of course that under this manoeuvre we wish to 
retain essentially the same theoretical apparatus which A and B share—it 
will be impossible for us to provide a replacement for some portion of A’s 
theory that is stated in terms of general laws, such that the properties 
of the kinds of individuals mentioned will, in virtue of their being of those 
kinds, be the ones required for explanatory adequacy. Instead, un- 
conditional theoretical possibilities of a system would be determined in 
part by some of its contingent properties. 

Indeed, B’s trading of contingent properties for ephemeral individuals
must also result in a loss of generality in his laws, insofar as these are 
formulated in terms of his sortals. Thus, suppose B construes spatio- 
temporal location with respect to some reference frame as an essential 
property of FEs. Changes of position are construed by him as cases of 
substantial change. Now if physical behaviour is independent of spatial 
coordinate values per se, A will be able to formulate his laws so that they 
are universal with respect to spatial location; B can do so only by denoting 
individuals by means of some location-independent sortal(s). But in doing 
so he is formulating his laws in a way which is covertly parasitic upon A’s 
classification scheme, for the new sortals which are introduced will be 
formulated in terms of the same considerations which lead A to count 
location as a contingent property. 

The argument might be thought therefore to force one to the other 
extreme; viz. to the conclusion that if T, is satisfying, its FEs themselves
must be permanent, and not admit of substantial change. This conclusion 
might seem equally counter-intuitive. Certainly, there are processes— 
perhaps fundamental ones—referred to in physical theory which have 
been construed as involving substantial change. Decay and annihilation 
events such as pair-production among the so-called elementary particles, 
which are permitted in relativistic quantum mechanics, are sure to invite 
this objection. One could of course attempt to explain such events by 
looking for permanent subentities whose rearrangement accounts for the 
disappearance of, say, an electron and positron, and the appearance of 
gamma rays. But to do so is to admit that the latter entities are non- 
fundamental. One needs, in other words, to make provision for the logical 
possibility that the most fundamental theory could be of what is known 
as the ‘bootstrap’ type, according to which there is no unique set of 
elementary particles which compose all other forms of matter and cannot 
themselves decompose. If such theories were unsatisfying, then only a
theory which postulated indestructible FEs could be genuinely fundamental.
It may turn out of course that our world’s FEs are indeed
indestructible. But my argument does not force any such strong conclusion
on a priori grounds.

Certainly, which new particles may be produced in any elementary 
reaction must be determined by some selection-rules, and in this sense 
at least, their simple properties are genetically explained. These selection 
rules are constrained by considerations depending upon the conservation 
of certain physical quantities. Such conservation principles are typically 
derivable from particular symmetries, and these symmetries in turn bear 
upon the sense in which the laws governing the behaviour of a system 
can be universalised. (Sometimes they are taken to be a clue that some 
permanent subentities persist through and underlie the transformations 
of the current putatively fundamental entities: e.g. the hypothesis that 
baryons are composed of quarks.) The importance of such symmetry
considerations is particularly well illustrated by the central role they play 
in theoretical discussions concerning the fundamental level of physical 
theory. In the case at hand, they forbid events of a type in which the 
annihilation of a single isolated particle results in the creation of a single 
particle of a different type. In cases where two new particles are created 
by spontaneous decay, we have of course independent grounds for 
describing the event as one involving destruction and creation of FEs. 
So, in general, the fact that some properties of FEs are permanent features 
of those FEs does not entail that the FEs themselves are permanent.¹

The matter of the relationship between symmetries and the variables 
over which fundamental laws may be universalised proves to be of central 
importance. For example, the fundamental laws describing all ‘isolated’ 
physical systems are, as far as we know, invariant under replacement of 
the time-coordinate variable $t$ by a different variable, $t^*$, related to $t$ by
the expression $t^* = t + \Delta t$, where ‘$\Delta t$’ denotes an arbitrary time-interval.
Invariance of the laws under this transformation is equivalent to the 
statement that the physical behaviour of the system is independent of the 
time-coordinates one uses in describing it; if two isolated systems differ 
initially only in that one exists later than the other, their physical behaviour 
will be identical. It follows that the set of unconditioned possibilities is 
on this principle preserved under the passage of time. 
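
The content of this invariance can be illustrated numerically. What follows is a minimal sketch and no part of the original argument: the function names, the harmonic force law, the hypothetical clock-dependent law and all numerical values are assumptions chosen purely for illustration. Two systems with identical initial states are started at clock readings differing by Δt. Under a law with no explicit dependence on t their histories coincide; under a law whose coefficient drifts with t they do not.

```python
def evolve(force, t_start, x0, v0, duration, dt=1e-3):
    """Integrate x'' = force(x, t) by velocity Verlet, starting the clock at t_start.

    Returns the state (x, v) reached after `duration` units of elapsed time.
    """
    steps = round(duration / dt)
    x, v = x0, v0
    for n in range(steps):
        t = t_start + n * dt
        a = force(x, t)
        x_new = x + v * dt + 0.5 * a * dt * dt
        a_new = force(x_new, t + dt)
        v += 0.5 * (a + a_new) * dt
        x = x_new
    return x, v


def invariant_law(x, t):
    # Illustrative law with no explicit dependence on the clock reading t.
    return -4.0 * x


def clock_dependent_law(x, t):
    # Hypothetical law whose "constant" drifts with the clock reading t.
    return -(4.0 + 0.1 * t) * x


DELTA_T = 17.5  # arbitrary shift of the time coordinate

early = evolve(invariant_law, 0.0, 1.0, 0.0, 3.0)
late = evolve(invariant_law, DELTA_T, 1.0, 0.0, 3.0)
same_under_shift = early == late          # identical behaviour

early_d = evolve(clock_dependent_law, 0.0, 1.0, 0.0, 3.0)
late_d = evolve(clock_dependent_law, DELTA_T, 1.0, 0.0, 3.0)
differs_under_shift = early_d != late_d   # the shift now matters
```

In the second case the clock reading would itself be a theoretically relevant property of the system, which is exactly what the invariance rules out.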

1 At relativistic energies, the production of virtual particles by FEs forces a drastic
revision of the sense we can attach under current theory, if any, to the notion of
persistence through time of individual FEs. Here I am not concerned with that issue, but
with the issue of changes which are cases of substantial change because they involve a
change of natural kind.

The invariance of the equations of motion under such a transformation has
a physical interpretation. The same laws apply independently of when
an event occurs or of conventions regarding when to ‘start the clock’.¹
This in turn implies that differences in assignment of temporal coordinate 
number are not theoretically relevant grounds for marking off kind- 
differences among the FEs of the theory. If they were, then two observers 
who happen to have started their clocks non-simultaneously will not be 
able to agree about what kinds of FEs the system can be described as 
containing. Secondly, it can be shown by Noether’s Theorem that
corresponding to each symmetry there is a physically significant quantity 
which is preserved by the system. 

But some of these conserved quantities are not in fact strictly conserved;
correspondingly, the matching symmetry is partly ‘broken’. This too has 
consequences for the classification of the FEs of physics. For example, 
consider symmetry under rotations in isotopic spin space, which is broken. 
Physically, the symmetry corresponding to isospin arises from the fact 
that certain types of FEs (certain sets of baryons) are identical to each
other with respect to the sorts of interactions they have with other FEs, 
except with respect to a certain class of interactions which depend upon 
differences in their electric charge. The principal coupling of these 
particles to others is controlled by the strong-interaction field, which is 
much stronger than the electromagnetic coupling controlled by charge. 
Thus, strong interactions in which, e.g., a pion is involved, will be
relatively unaffected by interchanging that particle with another pion, even 
if the latter is differently charged. Such exchanges correspond to the 
group of rotations in isotopic spin space. The invariance under a trans- 
formation corresponding to such an exchange of particles is broken by 
the minor effect of changes in electric charge. This breaking of the 
symmetry introduces a ‘splitting’ of the FE-type ‘pion’ into a multiplet 
corresponding to the three different possible charge-states. The multiplet 
would ‘degenerate’ into just one pion if electromagnetic interactions did 
not at all affect the behaviour of pions. (Further splitting of the multiplet 
into a supermultiplet of eight states is in fact required when further kinds 
of interactions are considered.) 

1 It has been suggested that this might not be so: that, e.g., the velocity of light or the fine
structure constant are time-dependent. This would imply violation of the law of energy
conservation. Invariance under spatial translation and rotation similarly imply
conservation of momentum and these coordinates must be counted as contingent properties
of FEs. So variations in spatial and temporal coordinates in accordance with constraints
imposed by the permanent properties of the FEs of a system will generate a subset of
the unconditionally possible states of that system. Moreover these parameters will
be employed in individuating particular individuals of a theoretical kind.

This is an (approximate) symmetry which reflects the fact that it is
(almost) physically immaterial which of several fundamental types of
particles one uses in an interaction of any type. Moreover, the potential
degeneracy of the multiplet shows that the behaviour of any system will 
remain unchanged under the substitution, for any fundamental particle 
of a given kind in it, of another particle of that same kind, provided that 
the new individual is given the same values for contingent parameters that 
the old one had.¹ There results a permutation symmetry which leaves
systems unchanged under permutation mappings among individual FEs 
of the same kind. This, then, is a case in which the laws governing inter- 
actions are universalised over individual FEs. It should not be surprising, 
therefore, that one of the dominant activities in elementary particle 
physics is the search for exact and inexact symmetries, and the utilisation 
of these symmetries as a guide for the construction of fundamental particle 
taxonomies. 

The upshot of this, in physics, is that physical particles will be dis- 
tinguished at the fundamental level into types in just such a way as to 
precisely reflect the systematically discriminable interaction-governing proper- 
ties which they display. Two particles will be considered of the same type 
if and only if they are identical with respect to what sort of causal relation- 
ships they can have with the rest of the world. Contingency in turn reflects 
the directions in which universalisation over physical systems generally 
is possible. The sense of ‘can’ in the penultimate sentence is given by the 
range of situations obtainable by varying the values of the parameters 
which designate these contingent properties. Symmetries always reflect 
a ‘dimension’ along which universalisation over some range of cases is 
permissible, and hence imply a universal law, if entities to which that
range is relevant are marked by an appropriate kind-criterion. Contingent 
properties mark off the individuals over which universalisation occurs. 

1 E.g., initial position and momentum. Where splitting does not occur, particle-exchange
symmetries are, therefore, exact. Since, because of the Uncertainty Principle, the actual
trajectories of individual particles such as electrons cannot be discriminated in systems
in which their energy-states are well defined, spatial location sometimes cannot be used
to individuate particles which are symmetric under permutation. This fact imposes
restrictions upon the conditions under which it is physically meaningful to say that there
are n rather than m particles of a given type in a system. These conditions on the
possibility of counting determine the allowable number of FEs filling a given state, and in
turn determine some of the manifest properties of the macroscopic objects composed of
those FEs. (Cf. the discussion of the Pauli Exclusion Principle in Luciano Fonda and
G. C. Ghirardi [1970], pp. 129-35.)

Relative Essentialism 365

It might be argued that some of these symmetries (in particular those
connected with spatial and temporal transformations) are a priori truths.
I shall not be concerned here to argue one way or the other as to their
epistemological character. Rather, I hope I have shown the intimate way
in which commitment to a theoretical position legislates in matters of
classification when it comes to the FEs of that theory. What sort of
universalisation is permitted us in our description of the world within the
framework of a particular theory may be an empirical matter, but the 
ways in which we can universalise will be reflected in the symmetry 
conditions and laws on the one hand, and in the classification, for theo- 
retical purposes, of the particulars whose behaviour those laws govern, 
on the other. Thus, there may not be anything we can say a priori about 
what sorts of properties will be (under every conceivable theory) construed 
as contingent properties of FEs, and what sorts of properties as essential. 
We can, however, say that the ‘dimensions’ of universality which a given 
theory incorporates entail corresponding restrictions upon the classification 
of the FEs of that theory. We may conclude, then, that not just any 
classification of entities within the domain of a theory will be perspicuous 
for the purpose of formulating that theory. 


4 Let us now consider what light, if any, the preceding discussion sheds 
upon the first-listed (and I think metaphysically the deepest) of the puzzles 
with which this paper began. I have thus far argued that there are 
theoretical grounds for affirming a genuine distinction between the 
contingent and the essential properties of putatively fundamental entities. 
In approaching that issue I deliberately used the less tendentious terms 
‘permanent’ and ‘ephemeral’ with respect to properties instead of the 
terms ‘essential’ and ‘contingent’. But within the present context, these - 
amount to the same distinction. For, first, the question “What is it?’ is 
answered for the physicist by citing those properties of a thing that are 
the most fundamental ones to an explanation of its nature and behaviour; 
and it is just these therefore in terms of which the relevant taxonomy of 
FEs is formulated. And, secondly, to say that a property of such a thing 
is a permanent property of it is to say that it must continue to have that 
property throughout the course of its existence—that is, as long as this 
item with this physical nature continues to endure. So, for it to cease to
have that property is for it to cease to be. Conversely, a thing can survive 
changes in its ephemeral properties.¹ But what is the nature of the
necessary connection between the essential properties and the FEs
themselves?

1 My argument here is reminiscent of that of Bennet [1969].

An initial point may be made by approaching the question of the
universality of essentialist claims about FEs from a slightly different
direction than that employed in solving Locke’s puzzle about aberrancies.
The fact that there are aberrant DEs is merely a symptom of a much
wider characteristic of the high-level (i.e., shallow) laws pertaining to DEs,
namely their defeasibility. I have argued in detail elsewhere (Fales [1978])
that the ceteris paribus clause which infects most high-level generalisations
must be in principle redeemable, in the sense that the exceptions to those 
generalisations which cause us to invoke the defeasibility condition them- 
selves require explanation. I further argued that those explanations will 
frequently have to be reductive ones. This immediately suggests that a 
fundamental theory, if it is to be satisfying, must not contain defeasible 
laws: its laws must be statable in such a form that every case is fully 
accounted for. This in turn suggests one difference between ‘essentialist’ 
claims about DEs, e.g.,


All mammals have mammary glands.
and essentialist claims about FEs, e.g.
All electrons have a charge of $4.8 \times 10^{-10}$ esu.


(supposing electrons to be genuinely fundamental). The former claim 
admits of exceptions, for a mutant mammal may be born without mammary 
glands, and still be a mammal. But as I have shown, exceptions are 
admitted in the latter case only on pain of forfeiting the fundamentality 
of any theory that makes this claim. Since the permanent properties of 
the FEs of a fundamental theory determine the form of the causal laws 
of that theory, and since the essentialist claims concerning these properties 
are purportedly exceptionless, the causal laws themselves will be putatively 
free from exception. 

But universality and necessity—so the essentialist would have it—are 
not the same thing. What kind of necessity is it, precisely, with which 
essential properties ‘attach’ to fundamental particulars? Alethic modality, 
we have been told, comes in various species. There is an austere logical 
necessity, definable in terms of theoremhood in a standard quantificational 
calculus; analyticity, which derives from linguistic conventions; ‘con- 
ceptual’ necessity (e.g., that nothing can be both red and green simul- 
taneously); and nomological or causal necessity. Finally, Kripke speaks, 
somewhat obscurely, of metaphysical necessity, which is necessity ‘in the 
highest degree—whatever that means’. (Kripke remarks parenthetically 
that this kind of necessity might prove to be physical necessity.) 

1 Kripke [1972].

The most natural view to take here, it might seem, would be that
essentialist claims of the sort I have been discussing are physically
necessary: not, of course, because it is supposed that FEs somehow cause
their essential properties, but because the true classification of FEs depends
upon what the physical laws of the universe actually are. If we imagine,
counterfactually, these laws to have been otherwise, then in general we
must imagine some concomitant change in the system of FE sortals.
Since the laws of nature are physically necessary, the essential properties 
of FEs are at best physically necessary. This position, if it is to be a 
genuinely essentialist one, is of course committed to a denial of Hume’s 
analysis of causal relations. 

But the line of reasoning supporting this view contains in any case a 
fallacy. For it does not follow from the fact that the laws of nature are 
physically necessary that the essentialist propositions determined by them 
are. What follows is that it is at best physically necessary that there should 
be the FEs that there in fact are. But it is one thing to assert the possibility
of there having been different sorts of FEs (which involves at most a
transgression of what is physically determined) and another to assert
the possibility that an FE of a given (existing) sort should fail to have some 
of its essential properties. To imagine the latter we should not merely 
have to imagine a possible world in which the actual laws of nature do 
not obtain; we should have to imagine a possible world in which some of 
our FEs exist and fail to have some of their essential properties. It is at 
best causally necessary that certain FEs exist rather than others; but it is 
necessary in the ‘highest degree’ that an electron (if that is an FE) have 
the quantum-mechanical properties that it does have. If something didn’t 
have those properties, it wouldn’t be an electron, and no change in the 
laws of nature could make it one. 

Thus I incline to Kripke’s view that essentialist claims are metaphysically 
necessary and not—barring the eventuality of their being found identical— 
merely physically necessary. Having said just this much, however, leaves 
matters in a rather unsatisfactory position—first, because it does not give 
a complete answer to the question originally posed, and secondly, because 
that answer makes use of the unelucidated notion of metaphysical necessity. 
The original question, it will be recalled, concerned the nature of the 
connection between an FE and its essential properties. But while I take 
FEs to be substances, I have given no metaphysical analysis of the notion 
of a substance, and hence none of the relation between a substance and
its properties. These tasks, while necessary and fundamental, must be
set aside for the purposes of this paper.

1 A defence of this claim would be along Kripkean lines, and lies beyond the scope of this
paper. The argument would appeal to some version of the causal theory of reference.
Suppose, e.g., that electrons are FEs; then, in outline, we fix the reference of ‘electrons’
as whatever particles produce a certain systematic set of causal effects upon our measuring
instruments (ultimately upon our sense experience), where both ‘electrons’ and the
definite description ‘whatever...’ are construed as rigid designators which are
co-referential, not synonymous. Then what the referent of the definite description is depends
upon the causally relevant properties of electrons, but the identity itself is necessary
‘in the strongest sense’. We can imagine possible worlds in which our experiences are
the same as in this world, but in which the fundamental laws which govern the behaviour
of this world’s electrons do not obtain. But such a world would a fortiori not be a world
in which electrons exist.

368 Evan Fales

A further challenge to the essentialist view I am defending must and 
can, however, be addressed. It has sometimes been urged that the claim 
that an individual has certain essential properties—in particular, those 
defining the natural kind to which it belongs—is a purely conceptual 
thesis, i.e., a thesis whose truth or falsity can be determined by an appeal
to our conception of that individual. The idea is that, once we understand 
the nature of the natural kind to which an individual belongs, then we 
cannot conceive of that individual, an individual of that kind, as surviving 
a change into a different natural kind. And this, I am claiming, is correct, 
if a full-fledged scientific understanding of the natural kinds in question 
is presupposed; but it seems to fall prey to certain alleged counter- 
examples. The Greek myth concerning Proteus provides a typical example 
of this sort. Can’t we, after all, conceive of a human being who is trans- 
formed into an animal—or even into water—and back into a human being, 
retaining his identity all the while? Is not the story of Proteus in some 
sense intelligible to us, regardless of how much scientific knowledge we 
may possess concerning men? 

One way of answering this challenge is to insist that what makes Proteus’ 
history intelligible—assuming that it is—is that our criteria for re- 
identifying persons (and ancient gods) need not include any facts concern- 
ing bodily form so long as certain intentional characteristics are retained. 
Even in his aquatic state, after all, Proteus is trying to elude Menelaus, 
and has memories of the past. Or so we are asked to assume. 

But this response only defers the question. If the story concerning 
Proteus is intelligible, might not a similar, equally intelligible story be told
about, for instance, a certain actual tree? In either of these stories, is what
we are asked to assume actually possible? Probably it is not, and if not,
this shows that what we can in a certain sense conceive may, in some
instances, not be what is possible. I suggest that the way such mythological 
contexts are to be understood is this. What identity-preserving meta- 
morphoses of an individual are metaphysically possible is a function of 
what relevant laws of nature are, at least in the case of non-conscious 
natural objects. But we may not know what those laws are; what is 
epistemically ‘possible’ may not be metaphysically so; however, our ability 
to refer to an individual does not—indeed could not—presuppose know- 
ledge of its essence. This fact provides the clue to our understanding of 
such stories. For even if we do have full knowledge of an individual’s 
essence, we can abstract from this knowledge (i.e., ignore it) without
sacrificing reference. Abstracting in this way, we are able to conceive of 
the relevant physical laws having turned out to be other than they are.
In entering into our role as audience for the myth, we are in effect being 
asked to suspend, if need be, whatever beliefs we may have concerning 
whatever actual laws of nature preclude the mythical metamorphoses. We 
are asked to assume that whatever the laws are which govern the world 
of the myth, they are laws which would, were they to obtain, permit the 
requisite re-identification under a single natural-kind sortal. No particular 
myth-world laws need to be specified in order for us to engage in this 
general act of imagination, for just as, in the actual world, putative 
natural-kinds are identified or at least assumed to be eventually identifiable 
while we are still innocent of the theories that ultimately justify these 
classifications, so too we can imagine such terms to be introducible in the 
‘world’ of the myth. So what we are asked to accept has something of the 
character of a promissory note: just as we may not know the laws that 
govern this world, so what we accept as audience to the myth is that 
unspecified laws we could conceive to obtain would permit the
re-identifications in question.

Among the various doctrines which might be called essentialism, one is 
the view that there are natural kinds. It is a view which has fallen into 
disrepute since—and partly as a result of—the writings of John Locke. 
It does, I believe, stand in need of an important modification. I have 
argued that insofar as one is committed to the truth of a fundamental 
theory, one is thereby committed to certain strong restrictions upon how 
the FEs of that theory may be classified. One might always classify these 
differently, but against the background of that theory such classifications 
will not be felicitous. Theories commit us to viewing certain truths about 
FEs as necessary and others as contingent; hence particular FEs may be 
said to have essential properties. 

Of course I do not deny that every empirical theory is in principle open 
to reformulation or rejection; nor do I deny that a theory which is at one
time reductively fundamental, even if it is satisfying, is open to the 
possibility of being reduced in turn by some deeper theory. Just as theories 
are subject to recall, so too must our faith in the alleged necessary truths 
and classificatory criteria for FEs which they involve be open. What we 
have are putative a posteriori necessities which are subject to rejection 
and an essentialism which is subject to empirical control. 

The dominant concerns of this essay have been the nature of funda- 
mental theories and the logic underlying the relation of explanation to 
classification. It is possible to see now how all three matters are related 
to one another. Explanatory concerns will dictate in matters of classifica- 
tion; and an account of the sense in which the a posteriori necessary truths 
stating classificatory criteria are open to empirical revision can be given
only insofar as one has an account of the way in which theories are open 
to question in the face of empirical evidence or alternative theories. 


The University of Iowa 
Discussions


DEGREES OF TRUTHLIKENESS: FROM SINGULAR SENTENCES 
TO GENERALISATIONS 


1 Several philosophers have recently proposed definitions for the degree of
truthlikeness of first-order generalisations (see Tichý [1976], [1978]; Niiniluoto
[1978a,b] and Oddie [1978]). In spite of a common technique, viz. Hintikka’s
distributive normal forms, the definitions of Tichý and myself differ from each
other in many important respects (see Niiniluoto [1978b], pp. 287-90, 304-6,
317, 319-20).

To give an example of these differences, let L be a monadic language with
primitive predicates $\lambda = \{O_1, O_2, O_3\}$ and let $C_1$, $C_2$, and $C_*$ be the constituents
in L illustrated in Fig. 1. (A constituent in L claims that the shaded cells and
only they are instantiated. $C_*$ is assumed to represent the truth.)


Figure 1. [Diagrams of the three constituents; shaded cells are those claimed to be instantiated.]


Then Tichý would claim that $C_2$ is closer to $C_*$ than $C_1$ is, while my measures
d and d, (op. cit., pp. 302-3) give the opposite result that $C_1$ is closer to $C_*$ than
$C_2$ is.²

1 For the readers of Niiniluoto [1978b], I use the opportunity of correcting the following
misprints:
p. 283: connect ‘Niiniluoto (1978a) and ‘Tichý (1978) by line in Fig. x.
p. 302, line 17: for ‘W.C.’ read “W.K?
p. 303, line 9: for ‘d? read ‘dy’
p. 303, line—7: for ‘L? read ‘LF?’
p. 318, line—7: for ‘RECT —9, read ECT, (* -)”
p. 318, line—3: for ‘rCt’ read ‘Ct’.
2 I am grateful to Professor Tichý for bringing (in correspondence) this particular example
to my attention.

Is there any way of resolving such a conflict between different explications?
To say only that Tichý and I have ‘different intuitions’ about truthlikeness is
not at all helpful. In a similar situation I have earlier given an argument of the
following form: if principle A were valid, then Tichý’s intuitive judgment
G about example X would be justified. However, A is not valid, and Tichý’s
own formal definition is incompatible with A. If I now would conclude on these
grounds that judgment G is false, then I would indeed be guilty of a fallacious
inference, as Tichý has pointed out ([1978], p. 195). However, my conclusion
is rather the weaker claim that, as Tichý cannot consistently motivate G by 
means of the most natural explanation A for G, he is obliged to give some other 
justification for G. 

Tichý thinks that intuitive examples are used to test general principles, not
vice versa. However, I do not think that there is something like a pre-existing 
intuition about truthlikeness which is sufficiently rich to be employed as a basis 
for testing all interesting principles. Therefore, intuitive examples are not only 
used for testing principles, but it is also the case that general principles are 
needed to test and develop our intuition. 

In this spirit, I try to explain in this paper how one can come to the two 
different conclusions about the example in Fig. 1. It seems to me that there is 
a natural (but problematic) way of motivating Tichý’s judgment in this situation
and if he does not subscribe to it, it is up to him to explain what alternative 
reasons he has for his view. 


2 In my first paper on truthlikeness, I proposed definitions for the degree of 
truthlikeness of monadic generalisations and of simple singular monadic state- 
ments (see Niiniluoto [1977], p. 143; see also Niiniluoto [1978b], p. 300). These 
two definitions are in principle motivated quite independently of each other, 
and therefore I have ever since been puzzled by the question of how these two 
measures are related. In particular, is it possible to obtain the former as some
function (perhaps as a limit) of the latter?¹ To study this question, let us
start from the singular statements and try to reach in the end the case with 
generalisations. 

Let $\{Q_1, \ldots, Q_K\}$ be a classification system with $K$ mutually exclusive cells
$Q_i$, $i = 1, \ldots, K$, and assume that the distance between cells $Q_i$ and $Q_j$ is $d_{ij}$.
In the simplest case, cells $Q_i$ are defined by the Q-predicates

$$(\pm)O_1(x) \;\&\; \ldots \;\&\; (\pm)O_k(x)$$

of some monadic first-order language $L$ with $\lambda = \{O_1, \ldots, O_k\}$, where $K = 2^k$.
Then $d_{ij}$ can be defined simply as the number of different conjuncts in the
definitions of $Q_i$ and $Q_j$, divided by $k$, so that $d_{ij} \in \{0, 1/k, 2/k, \ldots, 1\}$.²
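The distance between cells can be illustrated with a short sketch. It assumes, purely for illustration, that each cell is encoded as a tuple of truth values for $O_1, \ldots, O_k$; the name `cell_distance` is hypothetical and not taken from the text.

```python
from fractions import Fraction
from itertools import product


def cell_distance(cell_i, cell_j):
    """Share of conjuncts on which two Q-predicates differ."""
    k = len(cell_i)
    differing = sum(1 for a, b in zip(cell_i, cell_j) if a != b)
    return Fraction(differing, k)


k = 3
# Illustrative encoding: True stands for O_m, False for ~O_m.
cells = list(product([True, False], repeat=k))
possible_distances = sorted({cell_distance(p, q) for p in cells for q in cells})

print(len(cells))           # number of cells, K = 2**k
print(possible_distances)   # the attainable values 0, 1/k, ..., 1
```

With $k = 3$ the sketch enumerates the $K = 8$ cells and confirms that the only attainable distances are multiples of $1/3$.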

Assume that we have to place $n$ individuals $a_1, \ldots, a_n$ in the classification
system $\{Q_1, \ldots, Q_K\}$, and that the correct place for $a_j$ is $Q_{g_j}$, i.e., $Q_{g_j}(a_j)$ is true.
If we place $a_j$ in cell $Q_{i_j}$, then our mistake is measured by

$$m(Q_{i_j}(a_j)) = d_{i_j g_j}. \tag{1}$$

Our total mistake with respect to the sequence $a_1, \ldots, a_n$ can obviously be
measured by the average error:

$$m(Q_{i_1}(a_1) \;\&\; \ldots \;\&\; Q_{i_n}(a_n)) = \frac{1}{n} \sum_{j=1}^{n} d_{i_j g_j}. \tag{2}$$


1 That this question is related to the above example was realised by me when I tried to
find an answer to a stimulating letter of Mr Graham Oddie.

2 In some cases we may also assume that $d_{ij} = 1$ for all $i$ and $j$. For more general situations,
dealing with families of predicates and with the conceptual spaces induced by such
families, see Niiniluoto [1978b], pp. 300-1. For polyadic constituents, see ibid., pp.
313-19. 
If $x_{in}$ is the number of $a_j$’s, $j = 1, \ldots, n$, with $m(Q_{i_j}(a_j)) = i/k$, then (2) can be
rewritten in the form

$$\frac{1}{k} \sum_{i=1}^{k} i \, \frac{x_{in}}{n}. \tag{3}$$


Formula (3) thus gives a definition for the distance of state-descriptions of the
form $Q_{i_1}(a_1) \;\&\; \ldots \;\&\; Q_{i_n}(a_n)$ from the truth.

Assume now that the number of individuals $a_1, \ldots, a_n$ grows without limit,
i.e., $n \to \infty$. Let $y_i$ be the limit of $x_{in}/n$ for $n \to \infty$, so that $y_0 + \ldots + y_k = 1$ and
$y_i$ is the limiting proportion of individuals $a_j$ with a mistake of size $i/k$. Then
the distance of infinite state-descriptions from the truth is given by

$$m\Big(\bigwedge_{j=1}^{\infty} Q_{i_j}(a_j)\Big) = \frac{1}{k} \sum_{i=1}^{k} i \, y_i, \tag{4}$$

which is obtained as a limit of formula (3).¹

If our task is to place the correct number of individuals in different cells $Q_i$,
rather than to take care that each $a_j$ goes to its own place, we are interested in
structural descriptions rather than state-descriptions. Structural descriptions are
disjunctions of state-descriptions, and they essentially state how many
individuals there are in the different cells or, in the infinite case, what is the
limiting proportion of individuals in different cells. If $S$ is a structural description
generated by $\bigwedge_j Q_{i_j}(a_j)$, then its distance from the truth can be defined by

$$\min_{v} \, m\Big(\bigwedge_{j} Q_{i_j}(v a_j)\Big), \tag{5}$$

where $v$ is a permutation of $\{a_j \mid j = 1, 2, \ldots\}$.

Measure (4) seems to justify Tichý’s claim about the example given in Fig. 1.
There $C_1$ can be regarded as an infinite conjunction claiming that all individuals
belong to the cell $O_1 \& O_2 \& O_3$, while $C_*$ claims that all individuals belong to the
cell $\sim O_1 \& \sim O_2 \& \sim O_3$. Thus, for $C_1$ relative to $C_*$, we have $y_0 = y_1 = y_2 = 0$
and $y_3 = 1$, which shows that the distance of $C_1$ from $C_*$ is maximal (i.e., one).
On the other hand, $C_2$ is an infinite disjunction of (infinite) state-descriptions for
which $y_0 > 0$, $y_1 > 0$, $y_2 > 0$, and $y_3 = 0$. Any of the latter is closer to $C_*$
than $C_1$ is. In this light, the motivation for Tichý’s claim is the fact that
according to $C_2$ all the individuals of the universe must be closer to their true
places than according to $C_1$.

It is interesting to note that Tichý’s definition of truthlikeness, when applied
to the monadic cases, can in fact be obtained as a special case of formulae
(4) and (5). Let us say that an infinite conjunction is equally distributed if the
limiting non-zero numbers of individuals in different cells are equal to each other.
Let us then stipulate that in evaluating the distance between two constituents
$C_1$ and $C_2$ the wider² of them is correlated with an equally distributed infinite
conjunction $B_1$ compatible with it and the other is correlated with an infinite


1 This definition has the interesting feature that $y_i$ may be zero even if $x_{in} > 0$. Thus, if
the number of mistakes of size $i/k$ is bounded from above by some finite number,
these mistakes are not counted.

2 The width of a monadic constituent is the number of cells which it claims to be
instantiated.
conjunction $B_2$, such that the distance of the structural description generated
by $B_2$ from $B_1$ (in the sense of (5)) is the minimum. This minimum distance
then equals the value which Tichý’s definition gives for the distance between
$C_1$ and $C_2$ (cf. Tichý [1978], p. 181).

For example, Tichý’s definition entails that in Fig. 1 the distance between
$C_2$ and $C_*$ is

$$\frac{0 + 1 + 1 + 1 + 2 + 2 + 2}{3 + 3 + 3 + 3 + 3 + 3 + 3} = \frac{3}{7}.$$

The same value is obtained by correlating $C_2$ with an infinite conjunction
claiming that the seven shaded cells are occupied by the same number of
individuals; for this conjunction relative to $C_*$ we have $y_0 = 1/7$, $y_1 = 3/7$,
$y_2 = 3/7$, and $y_3 = 0$, so that formula (4) gives for the distance

$$\tfrac{1}{3}\left(\tfrac{1}{7} \cdot 0 + \tfrac{3}{7} \cdot 1 + \tfrac{3}{7} \cdot 2 + 0 \cdot 3\right) = \tfrac{3}{7}.$$
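The arithmetic of this example can be checked with a minimal sketch of formula (4), using exact fractions. The function name `limit_distance` and the list encoding of the proportions $y_0, \ldots, y_k$ are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction


def limit_distance(y):
    """Formula (4): (1/k) * sum of i * y_i, where y = [y_0, ..., y_k]."""
    k = len(y) - 1
    return sum(i * y_i for i, y_i in enumerate(y)) / k


# Limiting proportions for the equally distributed conjunction correlated
# with the seven-cell constituent, and for the single-cell constituent.
y_seven_cells = [Fraction(1, 7), Fraction(3, 7), Fraction(3, 7), Fraction(0)]
y_single_cell = [Fraction(0), Fraction(0), Fraction(0), Fraction(1)]

# Tichý-style average over the seven shaded cells, as in the displayed quotient.
average_over_cells = Fraction(0 + 1 + 1 + 1 + 2 + 2 + 2, 7 * 3)

print(limit_distance(y_seven_cells), average_over_cells)
print(limit_distance(y_single_cell))
```

Both routes give the same value for the seven-cell constituent, while the single-cell constituent at the opposite corner comes out at the maximal distance.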


3 The given ‘derivation’ of Tichý’s measure shows that his notion of
truthlikeness can be motivated by assuming that monadic constituents can be viewed
as equally distributed infinite structural descriptions. It shows, for me at least,
why one could be inclined to define degrees of truthlikeness for constituents in
Tichý’s manner. At the same time it may make visible some of the accidental
features or limitations of his view: we may, for example, ask why $B_1$ is always
chosen as equally distributed even though $B_2$ will in many cases not satisfy this
condition.

Without going into such problems here, I shall try to show how the derivation
of Tichý’s measure may clarify the differences between his and my approaches.
Let us start from the definition of a monadic constituent (without identity):
it is a $K$-fold conjunction telling which cells $Q_i$ are exemplified and which are
empty, i.e., of the form

$$(\pm)(\exists x) Q_1(x) \;\&\; \ldots \;\&\; (\pm)(\exists x) Q_K(x). \tag{6}$$


In the claim (6), not only the permutation of individuals is considered as
irrelevant, but also the question of precisely how many (if at least one) individuals
there are in the different cells. This sort of statement is not so much a
construction from singular sentences as a ‘painting’ in a conceptual system, with
white and shaded cells (cf. Fig. 1). As the cells are in this case the primitive
carriers of information, it is natural to define the distance between constituents
so that it reflects the differences between the claims the constituents make about
these cells. The simplest possibility of doing so, i.e., what I call the Clifford
measure $d_c$, is to define the distance between $C_1$ and $C_2$ as the relative number
of different claims about the cells; this gives the result that in Fig. 1 $d_c(C_1, C_*) = 2/8 = 1/4$
and $d_c(C_2, C_*) = 6/8 = 3/4$. Another measure $d_J$, which also takes
account of the distances between the cells, gives the results $d_J(C_1, C_*) = 4/3$
and $d_J(C_2, C_*) = \tfrac{1}{3} + \tfrac{1}{3} + \tfrac{1}{3} + \tfrac{2}{3} + \tfrac{2}{3} + \tfrac{2}{3} = 3$.
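The Clifford values quoted for Fig. 1 can be reproduced with a small sketch. It assumes that a constituent is encoded as the set of cells it claims to be occupied, that the true constituent occupies one corner cell, and that the two rival constituents are the opposite corner alone and all cells except that opposite corner; which corner is taken as true does not affect the numbers. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

cells = set(product([True, False], repeat=3))
true_cell = (False, False, False)      # assumed true corner (choice is symmetric)
opposite_cell = (True, True, True)

c_star = {true_cell}                   # the true constituent
c_one = {opposite_cell}                # one cell, at maximal distance
c_two = cells - {opposite_cell}        # the seven shaded cells


def clifford(c_a, c_b):
    """Relative number of cells about which two constituents disagree."""
    return Fraction(len(c_a ^ c_b), len(cells))


def cell_distance(p, q):
    return Fraction(sum(a != b for a, b in zip(p, q)), len(p))


# Sum of distances from the true cell over the cells wrongly claimed occupied.
error_sum_c_two = sum(cell_distance(cell, true_cell) for cell in c_two - c_star)

print(clifford(c_one, c_star), clifford(c_two, c_star), error_sum_c_two)
```

The last quantity only reproduces the sum of six terms shown for the seven-cell constituent; it is not offered as a full definition of the second measure, which the text does not spell out here.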

It is immediately clear that such measures as $d_c$ and $d_J$ cannot be obtained
in the same way from singular sentences as Tichý’s measure, i.e., via (1)-(5),
since they crucially depend upon information which is lost already at the stage
(1): measures $d_c$ and $d_J$ are intended to be sensitive to the number of different
kinds of errors, not to the cumulative number of ‘individual’ errors. First, as
soon as a particular type of mistake has been made once, it becomes irrelevant
whether we repeat it several times. This condition is not satisfied by definition
(1). Secondly, definition (2) fails to take into account the fact that errors of the
same size may be of different kinds.¹

To illustrate both of these points, note that according to (2) the sentences

$$\sim O_1(a_1) \;\&\; O_2(a_1) \;\&\; \sim O_1(a_2) \;\&\; O_2(a_2)$$

and

$$\sim O_1(a_1) \;\&\; O_2(a_1) \;\&\; O_1(a_2) \;\&\; \sim O_2(a_2)$$

are equally distant from the truth

$$\sim O_1(a_1) \;\&\; \sim O_2(a_1) \;\&\; \sim O_1(a_2) \;\&\; \sim O_2(a_2).$$
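A brief sketch of the average error (2) makes the point concrete. It assumes the reading on which the first sentence repeats the same mistake (about $O_2$) for both individuals while the second makes two mistakes of different kinds; the tuple encoding and function names are hypothetical.

```python
from fractions import Fraction


def cell_distance(p, q):
    """Share of predicates on which two cells differ."""
    return Fraction(sum(a != b for a, b in zip(p, q)), len(p))


def average_error(placed, true_cells):
    """Formula (2): mean cell distance over the individuals."""
    total = sum(cell_distance(p, t) for p, t in zip(placed, true_cells))
    return Fraction(total) / len(placed)


# Each tuple gives the (O_1, O_2) truth values claimed for a_1 and a_2.
truth = [(False, False), (False, False)]
same_kind = [(False, True), (False, True)]        # same mistake twice
different_kinds = [(False, True), (True, False)]  # two different mistakes

print(average_error(same_kind, truth), average_error(different_kinds, truth))
```

Under these assumptions both sentences receive the same average error, so definition (2) cannot tell a repeated mistake from two mistakes of different kinds.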


Corresponding to this fact, Tichý’s measure gives the result that in Fig. 2
constituents $C_1$ and $C_2$ are equally distant from constituent $C_*$.

[Fig. 2. Four-cell diagram with columns $O_1$ and $\sim O_1$.]

Moreover, constituent $C_3$, which claims that all the four cells are exemplified, is,
according to Tichý, equally far from $C_*$ as $C_1$ and $C_2$.

One way of defending the claim that $C_1$ is closer to $C_*$ than $C_2$ is the following:
of the fourteen universal generalisations which can be formulated in language
$L$ with $\lambda = \{O_1, O_2\}$ seven are true; $C_1$ is compatible with three, and $C_2$ with
only one, of these true universal generalisations. Measures $d_c$ and $d_J$ thus reflect
our interest in finding true universal generalisations. They thus seem suitable
for the purpose of defining a notion of truthlikeness which is applicable
primarily to scientific theories, that is, to genuine generalisations which are not
to be viewed as infinite conjunctions of singular sentences.
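The counts of fourteen, seven, three and one can be checked by enumeration. The sketch below rests on assumptions that are not stated in the text: a universal generalisation is taken to be the claim that some non-empty proper subset of the four cells is empty, the true constituent is taken to occupy only the cell $\sim O_1 \& \sim O_2$, and the two rivals are taken to occupy, respectively, one wrong cell and two wrong cells. The cell labels and names are illustrative.

```python
from itertools import combinations

cells = ["O1&O2", "O1&~O2", "~O1&O2", "~O1&~O2"]

# Each generalisation is identified with the set of cells it declares empty.
generalisations = [
    frozenset(excluded)
    for size in range(1, len(cells))
    for excluded in combinations(cells, size)
]

# Assumed reading of Fig. 2: sets of cells claimed to be occupied.
c_star = {"~O1&~O2"}
c_one_wrong_cell = {"~O1&O2"}
c_two_wrong_cells = {"~O1&O2", "O1&~O2"}


def compatible(constituent, generalisation):
    """A constituent is compatible if none of its occupied cells is excluded."""
    return not (constituent & generalisation)


true_generalisations = [g for g in generalisations if compatible(c_star, g)]
kept_by_one = [g for g in true_generalisations if compatible(c_one_wrong_cell, g)]
kept_by_two = [g for g in true_generalisations if compatible(c_two_wrong_cells, g)]

print(len(generalisations), len(true_generalisations), len(kept_by_one), len(kept_by_two))
```

On this reading the constituent with a single wrong cell stays compatible with more of the true generalisations than the one with two wrong cells, which is the asymmetry the argument relies on.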


ILKKA NIINILUOTO 
University of Helsinki 


1 While $d_c$ takes all mistakes to be of the same size, and while Tichý’s measure does not
recognise a difference between different kinds of errors of the same size, measure $d_J$
takes both of these factors into account.

It is clear from the computations given above that Tichý wants to use average errors
where $d_J$ uses sums of errors. It is also easy to see that the approach with averages leads
to strange results in many cases. For example, let X be a substance which has two
isotopes with atomic weights, say, 230 and 250. Assume that two physicists, Jack and
Tom, both claim that X has three isotopes with weights 230, 250 and 270 (theory $T_1$).
Then Tom changes his mind and starts to support a theory $T_2$ according to which X in
fact has four isotopes with weights 230, 249, 251, 270. The average error of $T_2$ is now
less than that of $T_1$; therefore Tichý is, oddly enough, committed to saying that Tom’s
new theory is closer to the truth than Jack’s theory.
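The isotope example can be worked through numerically. The sketch assumes, as an illustration only, that the error of each claimed isotope is its distance to the nearest true atomic weight; the text does not fix this detail, and the function names are hypothetical.

```python
from fractions import Fraction

true_weights = [230, 250]
theory_1 = [230, 250, 270]          # the theory Jack keeps
theory_2 = [230, 249, 251, 270]     # the theory Tom moves to


def errors(theory, truth):
    """Distance of each claimed weight to the nearest true weight (assumption)."""
    return [min(abs(w - t) for t in truth) for w in theory]


def average_error(theory, truth):
    e = errors(theory, truth)
    return Fraction(sum(e), len(e))


def total_error(theory, truth):
    return sum(errors(theory, truth))


print(average_error(theory_1, true_weights), average_error(theory_2, true_weights))
print(total_error(theory_1, true_weights), total_error(theory_2, true_weights))
```

Under this assumption the average error falls when Tom adds two slightly wrong isotopes, whereas the summed error rises, which is the contrast between averages and sums drawn in the footnote.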


LIGHT SIGNAL SYNCHRONISATION AND CLOCK TRANSPORT
SYNCHRONISATION IN THE THEORY OF RELATIVITY


In this note I will show that the requirement of agreement between light-signal- 
synchronisation and clock-transport-synchronisation leads to an unambiguous 
description of the behaviour of moving bodies and clocks, and that from this 
requirement the principle of the impossibility of measuring the one-way-velocity 
of light follows. 

To be able to perform any calculation in physics, it is necessary to define 
first an appropriate labelling of the point-events. The labelling can be made by 
means of some rather arbitrarily chosen operations. The calculations, however, 
are of a particular simplicity and elegance if this labelling involves synchronisation 
of distant clocks according to Einstein’s well known synchronisation rule by 
means of light signals. 

To each type of labelling (i.e. also to each type of synchronisation of distant
clocks) an appropriate formulation of the laws describing the behaviour of
moving bodies and clocks has to be found.

Using the labelling which involves the Einstein synchronisation of distant
clocks, and using the results of the Michelson-Morley (or Kennedy-Thorndike)
experiments, we obtain a set of rather general transformations, one
transformation for each pair of inertial frames (Podlaha [1975]). The transformation
for the shift from the ether frame $S^0$ to another frame $S$ is given by:

$$
\begin{aligned}
x' &= G^{S^0,S} \, \frac{x - w_{S^0,S}\, t}{(1 - w_{S^0,S}^2/c^2)^{1/2}}, \\
y' &= G^{S^0,S} \, y, \\
z' &= G^{S^0,S} \, z, \\
t' &= G^{S^0,S} \, \frac{t - w_{S^0,S}\, x/c^2}{(1 - w_{S^0,S}^2/c^2)^{1/2}},
\end{aligned} \tag{1}
$$

where $G^{S^0,S}$ is a coefficient, and $w_{S^0,S}$ is a parameter appropriately fixed for the
two labellings $S^0$, $S$. When the parameter $w_{S^0,S}$ is interpreted as the relative
velocity of the frame $S$ with respect to the ether frame $S^0$, we can rewrite the
coefficients $G^{S^0,S}$ as a function

$$G^{S^0,S} = G^{S^0,S}(w_{S^0,S}). \tag{2}$$


It is important to note that the metric of the space $R^4$, given by the ‘light
geometry’, still leaves the coefficients $G^{S^0,S}$ indeterminate. The coefficients
$G^{S^i,S^k}$ for the transformations between the inertial frames $S^i$ and $S^k$ can be
calculated from the coefficients $G^{S^0,S}$ (Podlaha [1975]). The values $\Psi = 1/G^{S^0,S}$
represent physically the transverse contractions of bodies moving with the
velocity $w_{S^0,S}$ with respect to $S^0$. Because the transverse contraction of moving
bodies $\Psi$ is related to the longitudinal contraction $\Phi$, and to the slowing
down of the clocks $\Omega$, by the relations

$$\Phi = R\,\Psi, \qquad \Omega = R\,\Psi^{-1}, \qquad R = (1 - w^2/c^2)^{1/2}, \tag{3}$$


the factors $G^{S^0,S}$ are also decisive for the values of the length contraction and
time dilatation. In the following we accept as a working hypothesis that

$$G^{S^0,S} = (1 - w^2/c^2)^{-1/2 - \gamma}, \quad \text{i.e.} \quad \Omega = (1 - w^2/c^2)^{-\gamma},$$

where $\gamma$ is still an indeterminate constant parameter, $\gamma \in \langle -1/2, 0 \rangle$.

Now we introduce the operation of clock-transport-synchronisation. In the
frame $S$, moving with respect to the ether frame $S^0$ with the velocity $w_{S^0,S}$, let
us transport the clock $C$ uniformly on a straight line from the point $A$ to
the point $B$ and back, the points $A$ and $B$ being located on the $x$-axis. For
convenience in the calculations, let us use the auxiliary ‘light geometry’ network
(with the Einstein synchronisation of distant clocks) to define the relative
velocity $v$ of the clock $C$ with respect to the frame $S$ during the motion of $C$ from
$A$ to $B$, and the relative velocity $-v$ during the motion from $B$ to $A$. Later, however,
the auxiliary network will be eliminated.

In the frame $S$ let $R_0 = 0$, $R_B$, $R_A$ be the readings of the clock $C$ at the start
from $A$, at the arrival at $B$, and at the arrival back at $A$, respectively, and let
$T_0 = 0$, $T_B$ and $T_A$ be the corresponding time coordinates measured in the $S$
frame.

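Let us define the operation of the clock-transport-synchronisation by the
equations

$$T_B^{SYN} = R_B + \Delta T, \tag{4a}$$

$$\Delta T = \frac{T_A - R_A}{2}, \tag{4b}$$

$$T_B = \frac{T_A}{2}. \tag{4c}$$

From the equations (4) it follows that

$$T_B^{SYN} = \frac{R_B - \Delta R}{2} + T_B, \tag{5a}$$

where

$$\Delta R = R_A - R_B. \tag{5b}$$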
The observables $R_B$ and $\Delta R$ are given by the following integrals (see Podlaha and
Sjödin [1977]):

$$R_B = I_{AB} = \int_{0}^{T_B^0} (1 - w_{AB}^2/c^2)^{-\gamma} \, dt, \tag{6a}$$

$$\Delta R = I_{BA} = \int_{T_B^0}^{T_A^0} (1 - w_{BA}^2/c^2)^{-\gamma} \, dt, \tag{6b}$$

where $w_{AB}$, $w_{BA}$ are the parameters characterising the state of motion of the
clock $C$ in the ether frame $S^0$ during the movement from $A$ to $B$ and from $B$ to $A$.
$T_B^0$ and $T_A^0$ are the arrival times given in the $S^0$ frame. Taking as a basis for
the calculations the above mentioned auxiliary networks we can further write
(see op. cit.)

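$$1 - \frac{w_{AB}^2}{c^2} = \frac{(1 - u^2/c^2)(1 - v^2/c^2)}{(1 + uv/c^2)^2}, \tag{7}$$

$$T_B^0 = (1 - u^2/c^2)^{\gamma} \, (T_B + uL/c^2), \tag{8a}$$

$$T_A^0 = (1 - u^2/c^2)^{\gamma} \, T_A, \tag{8b}$$

with $-v$ in place of $v$ for $w_{BA}$, and writing $u$ for $w_{S^0,S}$.

By substituting (7) and (8a,b) into (6a,b) we obtain

$$I_{AB} = (1 - v^2/c^2)^{-\gamma} (1 + uv/c^2)^{2\gamma} \, (T_B + uL/c^2), \tag{9a}$$

$$I_{BA} = (1 - v^2/c^2)^{-\gamma} (1 - uv/c^2)^{2\gamma} \, (T_A - T_B - uL/c^2). \tag{9b}$$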


Because, in addition, $L = v T_B$ and $T_A = 2T_B$, we obtain immediately that
$I_{AB} = I_{BA}$ if and only if $\gamma = -1/2$.
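The equality condition can be checked numerically. The sketch below is not the author's calculation: it assumes that the outbound and return readings take the closed forms $(1 - v^2/c^2)^{-\gamma}(1 \pm uv/c^2)^{2\gamma}$ multiplied by the corresponding ether-corrected time intervals, with $u$ standing for the velocity of the frame $S$ relative to the ether frame. The function name and the sample values are hypothetical.

```python
import math


def leg_readings(gamma, u, v, t_b, c=1.0):
    """Assumed closed forms for the outbound and return clock readings."""
    length = v * t_b            # L = v * T_B
    t_a = 2.0 * t_b             # T_A = 2 * T_B
    common = (1.0 - v**2 / c**2) ** (-gamma)
    i_ab = common * (1.0 + u * v / c**2) ** (2.0 * gamma) * (t_b + u * length / c**2)
    i_ba = common * (1.0 - u * v / c**2) ** (2.0 * gamma) * (t_a - t_b - u * length / c**2)
    return i_ab, i_ba


# gamma = -1/2: the two legs agree whatever the ether velocity u.
equal_case = leg_readings(gamma=-0.5, u=0.3, v=0.4, t_b=1.0)
# Any other gamma in the admitted range: the legs differ when u and v are non-zero.
unequal_case = leg_readings(gamma=-0.25, u=0.3, v=0.4, t_b=1.0)

print(equal_case, unequal_case)
```

With $\gamma = -1/2$ the factor depending on $u$ cancels, so the two readings coincide; for other values of $\gamma$ a dependence on $u$ survives and the readings separate.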

It can be seen that the equivalence of the clock-transport-synchronisation
with the ‘light geometry’, i.e. $T_B^{SYN} = T_B$, holds if, and only if, $R_B = \Delta R$, i.e.
$I_{AB} = I_{BA}$.

An objection could be made to this type of proof, namely that the clock C has 
to be, at the point B, immediately decelerated and again accelerated in the 
opposite direction. This singularity at the point B can be, however, avoided by 
using another auxiliary clock D, moving with the velocity —v, and meeting the 
clock C exactly in the point B. 

Moreover, the operation of the clock-transport-synchronisation can be 
generalised to the case where the clock C moves on a circular path. This situation 
can be physically realised by means of a clock moving on a rotating disc. The 
calculations in such a situation (see ibid.) lead to integrals of the type

$$I = \int_{T_1}^{T_2} (1 - v^2/c^2)^{-\gamma} (1 + u v_x/c^2)^{1+2\gamma} \, dt. \tag{10}$$

It is immediately seen that also in this case $I_{AB} = I_{BA}$ if, and only if, $\gamma = -1/2$.


The operation of the clock-transport-synchronisation, performed by means of 
the clock C moving on a circular path is fully equivalent to the operation of 
synchronisation, performed by using the clocks C and D, moving uniformly
in a straight line in opposite directions. Therefore the auxiliary network can be 
eliminated. 

This fact is also related to the well known ‘clock problem’. It shows that 
Grünbaum’s [1954] explanation of the clock problem, using three clocks moving
on a straight line is in principle fully equivalent with our [1977] quite general 
explanation of the clock problem which takes into account an arbitrary curved 
motion of the clocks. Therefore the general theory of relativity is not necessary 
for the solution of the clock problem. 

Lastly, let us look at the term $(1 + u v_x/c^2)^{1+2\gamma}$ in the integral (10). If $\gamma \neq -1/2$,
this term makes it possible to find an operation for estimating the velocity $u$ of
the system with respect to the physical vacuum, as well as for estimating the
one-way-velocity of light.

In the case $\gamma = -1/2$, however, the above mentioned term becomes equal
to one, and the possibility of measuring the one-way-velocity of light, as well
as the velocity of the system with respect to the physical vacuum is lost.

For $\gamma = -1/2$, the clock transport synchronisation and the light signal
synchronisation are experimentally equivalent operations, and, taken separately
or together, they cannot determine the one way velocity of light, contrary to
the incorrect conclusions of Sexl and Mansouri [1977].²

On a historical note: my statement that the requirement of the equivalence 
of the light-signal-synchronisation and clock-transport-synchronisation deter- 
mines uniquely the behaviour of moving clocks, and determines y = —1/2, was 
already contained in another paper submitted to this Journal for publication in 
June 1977 which, however, could not be published at that time. 


M. F. PODLAHA 
Munich 
1 There are also epistemological objections against the use of general relativity in the
calculation of the clock problem. This general relativistic calculation of the clock 
problem makes use of accelerations. Because of this the wrong opinion arises that the 
cause of the clock retardation effects is the acceleration. In fact the acceleration plays 
only an auxiliary role as a measure of the deviation of the movement from inertiality. 
The real cause of the retardation of the clocks is the state of motion of the clocks in 
the physical vacuum. The fact that the acceleration is not the cause of the clock 
retardation can be tested also by another independent experiment (see Podlaha and 
Sjödin [1980]). 

2 Let me also note that my clock-transport-synchronisation is in no way restricted to
‘slow-clock-transport’ but is valid at ‘rapid-clock-transport’ as well. 
IMRE LAKATOS’S PHILOSOPHY OF SCIENCE*

INTRODUCTION: CONTENTS OF THESE VOLUMES


These handsome and important posthumous volumes contain most of Lakatos’s 
published work except for Proofs and Refutations (Cambridge University 
Press 1976). There is also a good deal of new material ranging from work in 
progress to old stuff that had been put aside. Since this review attempts a 
general analysis of Lakatos’s contributions, it is well to begin with a brief 
summary of each volume. The first begins with a BBC talk, ‘Science and Pseudo- 
science’, that defines the ‘methodology of research programmes’ and says why 
it should matter. Then there is the long essay on this topic, first published in 
Criticism and the Growth of Knowledge. Next comes an account of ‘rational 
reconstructions’ from a Boston colloquium, followed by Lakatos’s essay on 
Popper for Schilpp’s Popper volume in the Library of Living Philosophers. There 
is a paper written jointly with Elie Zahar on ‘Why did Copernicus’s Research 
Programme Supersede Ptolemy’s?’ and at the end an unpublished essay, ‘Newton’s 
Effect On Scientific Standards’. 

Volume 2 begins with part of an early Aristotelian Society symposium on the 
philosophy of mathematics, followed by a set-theoretically oriented paper, 
‘A Renaissance of Empiricism in the Recent Philosophy of Mathematics?’ Then 
there is an entirely new body of speculations, ‘Cauchy and the Continuum’, 


* Review of Imre Lakatos [1978]: The Methodology of Scientific Research Programmes: 
Philosophical Papers, Volume 1 and Mathematics, Science and Epistemology: Philosophical
Papers, Volume 2, edited by John Worrall and Gregory Currie. Cambridge University
Press. £9.00, pp. vii + 250 and £10.50, pp. x + 285. 
followed by an earlier unpublished paper of less importance, ‘What Does a
Mathematical Proof Prove?’ At the end of this part of the collection is a potpourri 
on ‘The Method of Analysis-Synthesis’, some prepared for a Finnish conference, 
and some written at the time of ‘Proofs and Refutations’ but never completed for 
publication. 

A second segment of Volume 2 is headed ‘Critical papers’. Two of these are 
prompted by Toulmin’s Human Understanding. Then there is part of a debate
with Grünbaum on crucial experiments, an assault on Carnap’s inductive logic,
and two further critiques, ‘Necessity, Kneale and Popper’ and ‘On Popperian 
Historiography’. The volume concludes with three feuilletons, one of which is a 
famous open letter to the Director of the London School of Economics at the 
time of the student revolt. 

This is in no way an ‘academy’ edition. Passages have been deleted, e.g. 
remarks to a fellow symposiast, and some intervening stages in the motivation 
of certain papers do not appear. Such cuts are sensible, and exclude repetition
or irrelevance. Editorial footnotes are sparse but sound. We owe a special debt
to J. P. Cleave who prepared the Cauchy paper. He adds much useful supplementary
information without being intrusive. The index has eccentricities such as
no entry for ‘methodology’ in Volume 1, a slip which rides ill when we read 
Lakatos bullying Toulmin for not having the word ‘understanding’ in his index. 
The editorial preface contains only one piece of information beyond bald facts, 
and it is just the right thing to say: ‘Although Lakatos perhaps came to be
better known for his work in the philosophy of the physical sciences, he regarded 
himself primarily as a philosopher of mathematics’. The Press has not only 
made an accurate printing but has taken pains for example to re-do pages that 
it had already set up for previous publications. The reader, in short, has been 
well served all round, except that there should have been a brief description of 
what other writings Lakatos left behind. 


1.1 THE PROBLEM OF READING THESE PAPERS


Lakatos hoped that he would write The Changing Logic of Scientific Discovery, 
a book he announced as ‘forthcoming’ but which, as his editors say, he was 
never able to start. We note that several of the papers arise from the three 
volumes of Lakatos’s conference at Bedford College, London, in 1965. There is 
hardly an essay published by way of submission to journals. We have papers 
extracted from Lakatos for Festschriften, conferences and the like. The Cauchy 
paper was accepted by this Journal but persistently withheld. I fear there is no 
reason to think there would ever have been a Changing Logic. 

Yet there is a real need to invent a master book that locates these essays. This 
is not because Lakatos does not try to say what he is doing: he is always doing 
so, and constantly setting his work within his view of the history of philosophy. 
But we have the not unfamiliar spectacle of a writer whose placings of himself 
are not always those that help. If we read these essays as they come we get a 
body of doctrine that is immediately entertaining but collectively not very 
coherent. Indeed one has a slight feeling of paradox. On the one hand this 
philosophy is plainly of the first magnitude. Even readers whom Lakatos 
infuriated have to grit their teeth and admit that he is a disconcerting prominence. 
Yet the honest reader with historical interests can hardly avoid exclamations 
such as ‘Lakatos’s absurd historiography’,¹ or ‘historical parody that makes
one’s hair stand on end’.² The philosopher will find himself baffled by a
‘methodology’ that seems to reject method, or with a concept of ‘rationality’
that abolished the very idea of ‘being a reason for’. The working scientist finds
a key notion of ‘research programme’ that excludes most research programmes. 

The problem of reading these Papers is then to find some underlying problem 
and strategy that will explain to us how their scintillating but sometimes absurd 
surface lies over a fundamental contribution to the philosophy of knowledge. 
In posing my paradox I do not deny that the books give much instruction, 
provocation and pleasure on first reading. They abound in haphazard but 
insightful historical lore. There are several remarkable re-creations of Popper, 
and an elimination of Carnap’s inductive logic, which, in bold detail, seems to 
me to be along the right lines. There is a lot of inflammatory material about how 
to engage in the history of philosophy of science. There is a little tantalising 
mystery-mongering about Popper’s third world, and a good many fine aphorisms. 
Some of the essays have too many forgettable ‘isms’ in their design, yet their 
simplifying arrangement of philosophical positions does get a hold on one 
notwithstanding. There are imaginative speculations about Cauchy and Newton 
and Copernicus. But there is a danger of going through these five hundred 
pages without imagining what they are all about. I find myself forced to an 
exceedingly simple thesis about their subject matter. It is not one that Lakatos 
would likely have admitted, but I still believe that it grants to his work the 
kind of stature that it deserves. 


1.2 TWO AUDIENCES


This review is about what Lakatos wrote and not his subjective or personal 
motivations. But for purposes of exposition only it is worth noting that a 
philosophical emigré may very naturally have a listener on each shoulder, and 
by dint of unwittingly addressing both, fail to make plain what is being said 
to either. On the one shoulder is a thoroughly Hegelian and somewhat Hungarian 
conception of the events of modern philosophy, a body of historical conceptions 
that Lakatos takes for granted, hardly stating them. On the other shoulder are 
the English, whose scientific values are just what Lakatos wants, no matter 
how ignorant and insular the philosophy that runs alongside them. 

For example, modern English philosophy is wedded to a conception of truth 
as a representation of reality. To this it has annexed various values of objectivity, 
communication and adversary discussion. Lakatos would like to authorise 
those values without having the philosophy associated with them. On his 
Central European side, representational theories of truth were put to an end 
by Kant. The only postcritical English philosopher for whom Lakatos con- 
sistently has a good word is Whewell, and that furnishes a useful comparison. 
Whewell had both mastered Kant and become permeated by historicism, yet 
tried to maintain what is in a commonplace way right about the inductive 
sciences. ‘The Fundamental Antithesis of Philosophy,’ wrote Whewell, is


1 Nathan Reingold, reviewing other books in Isis, 68 (1977), p. 625. 
2 Holton [1978], p. 106. 
indicated by ‘the terms subjective and objective.’¹ Lakatos’s problem is to provide
a theory of objectivity without a representational theory of truth. 


1.3 THE GROWTH OF KNOWLEDGE 


The one fixed point in Lakatos’s endeavour is the simple fact that knowledge 
does grow. Upon this he tries to build his philosophy without any representa- 
tionalism, starting from the fact that one can see that knowledge grows whatever 
we think about ‘truth’ or ‘reality’. Four related aspects of this fact are to be 
noticed. 

First, one can see by direct inspection that knowledge has grown. This is
not a lesson to be taught by general philosophy or history but by detailed 
reading of specific sequences of texts. Read the material that lies behind ‘Proofs 
and Refutations’, that is, read the mathematical work stemming from Euler’s 
conjecture about polyhedra. There is no doubt that more is known now than 
was grasped by the genius of Euler. Or to take an example from the methodology 
of research programmes paper: it is equally manifest that after the work of 
Rutherford and Soddy and the discovery of isotopes, vastly more was known 
about atomic weights than had been dreamt of by a century of toilers after 
Prout had hypothesised in 1815 that hydrogen is the stuff of the universe, and 
that atomic weights are integrable multiples of that of hydrogen. I state this 
trivial point to remind ourselves that there ¢s a trivial point which is the starting 
point of Lakatos’s work. Note that the point is not that there is knowledge 
but that there is growth; we know more about polyhedra or atomic weights than 
we once did, even if future times plunge us into quite new, expanded 
reconceptualisations of those domains. 

Secondly, there is no arguing that certain cases exhibit the growth of know- 
ledge. What is needed is an analysis that will say in what this growth 
consists, and tell us what else is growth and what is not. Perhaps there are 
people who think that the development from Euler or the discovery of isotopes 
is no growth, but they are not to be argued with. They are likely idle and have 
never read the texts that exhibit the growth (or perhaps they think that we 
claim there is certain knowledge here, and not merely growth of knowledge). 
There are indeed writers who urge that some kinds of knowledge are quite 
different from that illustrated by Euler or Rutherford. Thus Habermas claims 
that in addition to positive knowledge there is both a knowledge of society and 
knowledge of interpretation called hermeneutical. That is not a doctrine to be 
debated but to be confronted, and the ground of confrontation is that Habermas’s 
other two kinds of knowledge do not exhibit that growth which is Lakatos’s 
starting point. An analysis of the growth of knowledge is expected, by Lakatos, 
to display what the growth consists in, and that will be a sufficient contrast 
with hermeneutics or the Verstehen-sociology. 

This thought leads to the third point: the growth of knowledge will provide a 
demarcation between ‘rational’ activity and ‘irrationalism’. In Section 2.1 below 
I shall study how Lakatos tries to foist on us a radical change in the conception 
of rationality; for the present, note the shift from Popper’s demarcation problem 
of fifty years ago. Popper arrived at an implicit division into science, metaphysics 


1 Whewell [1848], vol. 1, pp. 29-30. 
and muck. Metaphysics is the earnest speculation that can some day lead to
positive science. The logical positivists had science vs. metaphysics-muck, but
Popper had a better set of distinctions in mind, illustrated by the fact that 
the muck has now organised itself as something apart from speculative meta- 
physics. Lakatos now is willing to lump the metaphysics that becomes science 
alongside science itself, because it is part of the larger growth of knowledge 
that concerns him. Thus metaphysics-science confronts the muck. Popper’s 
contribution to The Positivist Dispute in German Sociology1 reads like letters 
from a nineteenth century country vicarage compared to what Lakatos would 
have written. This is partly because Popper ended up writing in terms of the 
science-metaphysics distinction. But the problem has changed. Now Lakatos 
asks how the ‘progressive’ Popperian metaphysics-science bag is to be characterised. 
Note that I write ‘characterised’. First one recognises the growth of 
knowledge by detailed examples, then one characterises it. One does not defend 
the claim that certain cases exhibit the growth of knowledge, but uses the 
examples to define a new canon of ‘rationality’. 

The first three points I attribute to Lakatos are closely connected: (a) one 
can directly see, in particular cases, that there is growth of knowledge; (b) this 
is not argued for, but analysed; (c) the analysis invites a demarcation between 
‘rational’ knowledge-growing activity and ‘irrationality’. My fourth point is 
that the preceding three are conducted by internal considerations about the 
history of knowledge, and do not depend upon any theory of truth. The 
common English-speaking attitude is that knowledge is growing just if we 
are getting at more of the truth. It is not just that some of us define knowledge 
as justified true belief, but that truth is conceived of as fixed, while knowledge 
is to be defined as that which gets at this pre-existent truth. Hence in English 
philosophy knowledge is to be characterised externally, in terms of how well 
it represents reality. That is exactly what Lakatos is not primarily concerned 
with.2 It is a point that requires elaboration. To do so it is useful to resort to 
a potted history all too like those he was so fond of, but with a different subject 
matter and moral than his tales of ‘degenerating justificationism’ and so forth. 


1.4 OBJECTIVITY AND SUBJECTIVISM 


Kant undid the notion that for a proposition to be true it must represent some- 
thing else. He thereby epitomised the birth of a new problem that gnarled its 
way through nineteenth century philosophy: how are we to distinguish the 
objective from the merely subjective, if we are not allowed to say what objective 
truth represents? As implied by Whewell’s ‘Fundamental Antithesis’, objectivism 
and subjectivism form the problematic of more modern times. The objectivist 
is not against truth and reality, but requires some surrogate that preserves their 
values without their precritical naiveté. 

1 Adorno (ed.) [1969]. 

2 My colleague Solomon Feferman notes an equivocation in my point (a) that may call 
in question this conclusion. Lakatos writes both of the growth of knowledge and of the 
improvement of knowledge. The former might be characterised internally; it is less clear 
that ‘improvement’ can. It requires great care to avoid saying that an improvement in 
knowledge is not a better account of the truth, or of reality. See his paper ‘The logic 
of mathematical discoveries vs. the logical structure of mathematics’ in the Lakatos 
symposium forthcoming in PSA 1978, Volume 2, edited by P. Asquith and I. Hacking. 
Nietzsche and Peirce conveniently illustrate this nexus. The former writer 
tells how the true world became a fable. An aphorism in Twilight of the Idols 
(Bacon’s ‘idols’!) starts from Plato’s ‘true-world—attainable for the sage, the 
pious, the virtuous man’. We arrive, with Kant, at something ‘elusive, pale, 
Nordic, Königsbergian’. Then comes Zarathustra’s strange semblance of 
subjectivism. But that is not the only postcritical route. Peirce tried to replace 
truth by method. Truth is whatever is in the end delivered to the community 
of enquirers who pursue a certain end in a certain way. Various aspects of 
Peirce’s philosophy, especially the fallibilism and the evolutionary epistemology, 
have by now amply been compared to Popper. But the greater novelties in 
Peirce’s thought are seldom recalled: the idea that man is language, that the 
world is not deterministic, and that there is an objective surrogate for truth to 
be found in methodology. Habermas has given perhaps the best critique of the 
last of these three because it is important for him to show that positive science 
has no substitute for truth, and ‘hence’ no unique claim on us.1 I take Lakatos’s 
methodology to be a sophisticated and historicised version of Peirce’s logic of 
inquiry. This is not, of course, to attribute to Peirce Lakatos’s novelties of 
internal history, research programmes, heuristic and so forth. But both writers 
share the post-Kantian aim of replacing representation by methodology. 


1.5 LAKATOS’S DISCLAIMERS 


This view of Lakatos’s problem is mine not his. He tells a story of sceptics 
against dogmatists, with himself on the side of the dogmatists, i.e. those who 
think that there is knowledge to be had. There is a battle engaged in Hellenistic 
times, with the dogmatists too often led by justificationists who try to find 
grounds for knowledge. They are to be replaced by those who find another basis 
for what he cheerfully calls dogmatism and demarcationism. Now of course the 
sceptic or dogmatist casts of mind emerge at various times in history, and 
doubtless they align in a natural way with the postcritical problem of objectivism- 
subjectivism. But what Lakatos does not say is that in the precritical era of 
modern philosophy those grounds for knowledge so ardently sought by his 
justificationists were precisely part of a theory about how knowledge succeeds 
in representing reality. I think that his Hegelian side so takes for granted the 
impossibility of a serious representational theory of truth that he does not 
properly characterise his predecessors. This fact also makes it impossible for 
him to identify with the bulk of recent English philosophy, even though he is 
committed to its ideals of ‘objectivity’. 

Potted histories settle nothing. More important are Lakatos’s own explicit 
rejections of the problem I attribute to him. Thus in parentheses he contrasts 
a view of science as a ‘light-hearted sceptical gambit’ with the ‘more serious 
fallibilist venture of approximating the Truth about the Universe’ (volume 1, 
p. 114). He says that to do this ‘one needs to posit some extra-methodological 
inductive principle’. But although he does take up the theme of an in- 
ductive principle from time to time he never does posit such a principle. 
I take it as no accident that I have just quoted from a nervous passage in parentheses, 
in which we find ‘Truth about the Universe’ written with ironic capital 


1 Habermas [1968]. 
letters. A footnote to the parentheses directs us to a place where the theme is 
to be taken up again: but it is not there discussed with any seriousness. He does 
elsewhere urge on Popper a ‘whiff of inductivism’ that will generate at least a 
metaphysical (untestable) conjecture about the hook-up between the growth 
of knowledge and the approach to truth. But Lakatos seems to have seen pretty 
clearly that one would get nowhere with Popper’s hokum about the relevance 
of Tarski’s theory of truth to this project, or the subsequent doctrine called 
‘verisimilitude’. The preponderant evidence of all the texts published here is 
that Lakatos expended no effort on an inductive principle, and yet constructed 
sometimes quite bizarre notions that would serve as an effective surrogate for 
a theory of truth. 

Two quite different kinds of philosophers may well urge that if my reading is 
correct, then Lakatos must be mistaken. One of these kinds of philosopher is a 
generalist, the other, a particularist. The generalist says that Lakatos needs an 
inductive principle to be assured that the general laws and theories of which 
Lakatos writes do have some prospect of converging on the truth about reality. 
On my account this objection simply misunderstands the radical nature of 
Lakatos’s project. The particularist, in contrast, asks how we can be confident 
about particular future matters of fact. To use the example often used by 
Lakatos in conversation, if you want to get from the fourth floor down to the 
ground, why take the elevator rather than jump out of the window? This is a 
dramatic version of the Humean questions, What next? What to do now? 
Unlike the generalist the particularist does not misunderstand Lakatos, but 
since the Papers under review say nothing about the elevator problem, I shall 
not discuss it. But it does seem to me that this problem does not want any 
global inductive principle. It is to be answered partly by recalling Hume: for 
most of us it is a matter of habit, not reason, that we take the elevator. Contrary 
to Hume’s expectations, however, we can now supplement this by analyses 
based on one or the other school of philosophical probability. These can show why, 
relative to some background beliefs in general theories about the world, our 
habits with elevators are reasonable. Thus despite Lakatos’s occasional paren- 
theses about an inductive principle, one may argue that Lakatos quite wisely 
never states nor employs one. 


2.1 METHODOLOGY 


‘Methodology’ means the science of method. One expects it to give advice 
about what methods to employ to achieve some end. It should be a forward- 
looking classification of techniques, studying choice between competing pro- 
cedures and courses of action for the future. Sometimes Lakatos does use the 
word in this, its proper sense. His methodology of research programmes teaches 
that ‘one must treat budding programmes leniently; programmes may take 
decades before they get off the ground and become empirically progressive’ (p. 6). 
That is agreeable generosity and open-mindedness but not news. Lakatos also 
seems to use the word ‘methodology’ as the name of his philosophy of science, 
where the literal methodology I have quoted is only a corollary. What he names 
‘methodology’ is something backward-looking. It is a theory for characterising 
real cases of the growth of knowledge and distinguishing them from imposters. 
Nor is it claimed that with sufficient hindsight we can move to foresight, 
‘inductively’ guessing that a long-standing progressive programme will go on 
progressing. The methodology is simply backward-looking. 

Lakatos shifts ‘methodology’ but he takes ‘rationality’ even further from 
common acceptation. It may be difficult to absorb how radical his claims are. 
He frequently says he is casting out ‘instant rationality’. In particular he is 
against the idea that crucial experiments can in a moment decide between 
competing theories. That view is at present rather commonplace. In addition 
he is abolishing the entire philosophical project of trying to analyse ‘being a good 
reason for’. Consider Lakatos’s favourite recent examples. Carnap hoped that 
he could analyse good reasons as degree of probability. Good reason for an 
hypothesis, thought Carnap, is high probability in the light of evidence. Popper 
thought no objective probability could do that job, and offered instead the 
procedure of conjecture and refutation. Hypotheses are well-corroborated when 
they have survived vigorous testing, and ‘well-corroborated’ is to stand in for 
‘having a good reason for’—even though the particular bits of evidence revealed 
by testing are not themselves the good reasons.1 Like many preceding philosophers 
in this tradition, Carnap and Popper tried to give us a notion of, or 
substitute for, a ‘good reason’ that we can use now in assessing or evaluating 
hypotheses with a view to using them in the immediate future. 

Lakatos replaces all that by a theory for examining and sorting past sequences 
of theories to see whether they are degenerating or progressive. The degenerating 
theory is the theory that gradually becomes closed in on itself. To take an 
example I owe to Codell Carter, in the early years of this century the leading 
professor of tropical disease, Patrick Manson, persisted in trying to describe 
beriberi and some other deficiency diseases as cases of bacterial contagion. 
When all else had failed, and one was beginning to know that beriberi was 
caused by lack of something caused by polishing rice, Manson had it that there 
were bugs which lived and died in the polished rice, and they were the cause 
of beriberi. Auxiliary hypotheses are constantly closing in and excluding counter- 
examples by peculiar devices, while the progressive programme responds to 
the new examples with strong new predictions, some of which turn out right. 
But one can only tell what is progressive and what degenerating after the event.2 

I am not now quarrelling with this notion of retroactive rationality. I think it 
makes good sense of matters unintelligible on other accounts. One example is 
the undoubted fact, which few have dared to accentuate before Lakatos, that 
‘most theories are born refuted’. So even consistency with known facts is no 
good guide for future use of a theory. A more familiar point is furnished by 
crucial experiments. Many scholars now agree that experiments may appear 


1 It is important to take this hard idea literally. A more technical example illustrates 
what it is for one notion to be a surrogate for induction. The statistician Jerzy Neyman 
also thought that inductive inference is nonsensical, but that it can be replaced by a 
theory of inductive behaviour. He derives ‘confidence intervals’ from numerical data. 
The data are not evidence that an unknown numerical quantity of interest lies in the 
confidence interval: the interval is a surrogate for an inference which we would have 
made, if we believed in statistical evidence. Carnap claims to analyse evidence. Popper, 
like Neyman, says there is no evidence but there is a stand-in for evidence. Lakatos 
says there are no stand-ins. 

2 K. Codell Carter [1977]. For a vignette of an ‘authority’ trying to be neutral between 
a progressive and degenerating programme, see the article ‘Beri-beri’ in the 
Encyclopaedia Britannica, 11th edition, 1910. 




crucial in retrospect but seldom are so at the time of their performance. Lakatos 
says that one theory succeeds over another only after a prolonged period of 
progression opposed to degeneration; a crucial experiment signals the beginning 
of the end, but can be seen to have done so only later. 

Lakatos also makes at least some sense of an otherwise unintelligible old 
debate. Many practicing scientists are immensely impressed when a theory 
predicts phenomena before the theory. A strong band of philosophers, including 
Mill and Keynes, has insisted that this is an illusion. What matters to a theory, 
they say, is its ability to account for the facts, and it does not matter whether 
the facts were discovered before or after the theory. Lakatos sides with Whewell 
against Mill, but he does not give reasons. Rather he makes it true by definition 
that what matters to a theory is its ability to predict new facts. For that is what 
he comes to mean by ‘progressive’, namely ‘the Leibniz-Whewell-Popper 
requirement that the well-planned-building of pigeon holes must proceed faster 
than the recording of facts to be housed in them’ (Vol. 1, p. 100). 

‘As long as this requirement is met,’ he continues, ‘it does not matter whether 
we stress the “instrumental” aspect of imaginative research programmes ... or 
whether we stress the putative’ approach to truth. Thus he thinks his account 
combines the best elements of ‘voluntarism, pragmatism and the realist theories 
of empirical growth’. This may be misleading, for it suggests he is filtering out 
the desirable elements of various pools of wisdom. In fact these are the words 
of someone who takes the disputes between realist and idealist to be empty. 


2.2 APPRAISING SCIENTIFIC THEORIES 


Lakatos is concerned with the demarcation of science. His methodology is 
normative in that it may say, of some past episode in science, that it ought not 
to have gone that way. But his philosophy provides no forward-looking assess- 
ments of present competing scientific theories. There are at most a few pointers 
to be derived from his ‘methodology’. He says that we should be modest in our 
hopes for our own projects because rival programmes may turn out to have the 
last word. There is a place for pig-headedness when one’s programme is going 
through a bad patch. The mottos are to be proliferation of theories, leniency 
in evaluation, and honest ‘score-keeping’ to see which programme is producing 
results and meeting new challenges. These are not so much real methodology 
as a list of the supposedly ‘English’ values in science. 

If Lakatos were in the business of theory appraisal, then I should have to 
agree with his most colourful critic, Paul Feyerabend. The main thrust of the 
often perceptive assaults on Lakatos to be found in Against Method is that 
Lakatos’s ‘methodology’ is not a good device for advising on current scientific 
work. I agree, but suppose that was never the point of the analysis which, I claim, 
has a more radical object. Of course I do not deny that Lakatos had a sharp 
tongue, strong opinions and little diffidence. So he made many entertaining 
observations about this or that current research project, but these acerbic asides 
were incidental to and independent of the philosophy I attribute to him. 

I said earlier that Lakatos is concerned with objectivity and this might seem 
to be connected with theory appraisal. But ‘objectivity’ is ambiguous. A person 
may be objective, in the sense of being disinterested and alert, in deciding what 
courses of action to support. We hope that the patrons of science are ‘objective’. 
There is also an idea of objectivity quite different from disinterest. It is connected 
with Kant’s question. Does human knowledge have any validity and objectivity 
outside of the realm of subjective human constructions? This, I have claimed, 
is what preoccupied Lakatos. It is natural to think of disinterested ‘objectivity’ 
as a desire to get at the truth, and hence as connected with a belief in Kantian 
‘objectivity’. But the two are independent, especially for anyone who abandons 
a representational theory of truth. 

Is it a defect in Lakatos’s methodology that it is only retroactive? I think not. 
There are no significant general laws about what, in a current bit of research, 
bodes well for the future. There are only truisms. A group of workers who have 
just had a good idea often spends at least a few more years fruitfully applying it. 
Such groups properly get lots of money from Foundations. There are other 
mild sociological inductions, for example that when a group is increasingly 
concerned to defend itself against criticism, and won’t dare go out on a new 
limb, then it seldom produces interesting new research. But that has nothing 
to do with philosophy. There is a current vogue of what Lakatos might have 
called ‘the new justificationism’. It produces whole books trying to show that 
a system of appraising theories can be built up out of such rules of thumb. It is 
even suggested that the Foundations should fund such work in the philosophy 
of science, in order to learn how to fund other projects. We should not confuse 
such creatures of bureaucracy with Lakatos’s attempt to understand the content 
of objective judgment in science. 


2.3 HEURISTIC 


Whewell’s word ‘heuristic’ meaning the Art of Discovery is not far from what 
we commonly mean by ‘methodology’. The two words once ran alongside in 
Lakatos’s own work. Heuristic is a theory of finding out, advice on ‘how to 
solve it’. In questions of heuristic Lakatos was an acknowledged disciple of 
his countryman Georg Polya and he may even have hoped for a theory-neutral 
body of techniques of discovery. There is something of this in ‘Proofs and 
Refutations’. There we are taught that when a putative proof admits of counter- 
examples, we should not exclude the examples as monsters, thereby restricting 
the domain of the theory. Instead we should try to find a ‘hidden lemma’ 
concealed in the proof which will explain the existence of counterexamples. The 
best result is a new theorem that not only explains why there are counter- 
examples but also takes them in its stride as special cases of the theorem. Such 
a global revision of a theorem may even lead us to new classes of examples to 
which the proof applies. 

Notice how these features of mathematical heuristic are transferred to the 
methodology of research programmes. Procedures recommended for advance 
in mathematics become the mark of the progressive as opposed to the de- 
generating programme. So what was heuristic now becomes part of the backward- 
looking methodology and in Lakatos’s later work ‘heuristic’ ceases to refer to a 
theory-neutral collection of strategies. Instead each individual research pro- 
gramme is defined by two elements: the hard core of propositions deemed 
central to a theory, and an accompanying ‘heuristic’ that details how this theory 
shall relate to its anomalies. 

Thus I see Lakatos’s attitude evolving as follows. Once there was to be 
methodology-heuristic, it was forward-looking, telling how to get on with the 
job, ‘how to solve it’. This split into two things. First and foremost is method- 
ology, a backward-looking way to characterise the essence of the growth of 
knowledge. In addition each research programme—each unit for assessment as 
growth of knowledge—has its own forward-looking ‘heuristic’. But aside from 
a few unspecific maxims about proliferation of programmes, modesty and pig- 
headedness, there is no longer any heuristic of a general kind. The logic of 
justification and the logic of discovery have both been dumped in favour of a 
global theory of objectivity that takes in many local strategies for finding out 
about specific domains. Consideration of ‘Proofs and Refutations’ helps one 
to see how this happened. 


2.4 PHILOSOPHY OF MATHEMATICS 


Euler proved a relationship between the number of edges, vertices and faces 
on polyhedra. Lakatos’s dialogue on this theorem is a philosophical and literary 
achievement of the stature of Hume on natural religion or Berkeley’s Hylas 
and Philonous. It has been widely admired, for example by Quine reviewing in 
this Journal and by Lakatos’s Russian translators. A recent review in Mind 
takes it to be an almost completed theory of mathematical heuristic. Everyone 
should delight in the main text, where philosophical positions and mathematical 
insights emerge in the mouths of the characters, while the footnotes engagingly 
‘chime in’ with the chronological location of those self-same ideas in 200 years 
of mathematical history. But for all the praise of the dialogue’s charm, learning, 
and its lessons for mathematics teachers, it may not yet be adequately appreciated 
for its implication about the content of mathematics. It is seldom noted how 
useful it is to read the dialogue in company with Wittgenstein’s Remarks on the 
Foundations of Mathematics (Lakatos put some rude and somewhat idiotic 
interjections about Wittgenstein into his later publications, but he read the 
Remarks carefully when writing ‘Proofs and Refutations’). Where Wittgenstein 
gives hypothetical illustrations about following rules, diverging practices and 
concept formation, Lakatos gives real life examples. Wittgenstein’s book is, in this 
respect, like a bestiary compared to Lakatos’s natural history. But it is not this 
aspect of Lakatos’s philosophy that survives in his writings on methodology. 

A fairly constant target in the dialogue is the ‘Euclidean programme’ of making 
everything certain and infallible. We are told that in the end we can succeed in 
this, but in a strange way. Critical discussion can enable a conjecture to evolve 
into logical truth. In the beginning Euler’s theorem was false; in the end it is 
true because we have come to formulate a concept of polyhedron that makes it 
true. The theorem has been ‘analytified’. Yet making it true by convention was 
not a matter of fiat but the product of refined analysis. This doctrine of analytification 
has unsettling consequences. The Platonist cannot welcome a view which 
makes the truth of the proposition in the end something embedded in the canons 
of mathematical language, where the ideas are stripped of their dignity. They 
are no longer what makes mathematics true, nor the subject matter of mathe- 
matics. Yet the nominalist is equally disconcerted, for even if we end up with 
truth by convention, the convention seems to be organising a ‘reality’ that has 
nothing to do with words. 

Lakatos’s resolution of this tension is hinted at by the word ‘quasi-empirical’ 
which occurs in the set-theoretic discussion in volume 2, ‘A renaissance of 
Empiricism in the Recent Philosophy of Mathematics’. The paper itself is 
unsatisfactory. It is uncharacteristically ‘dated’, focusing on issues about 
Foundations of Mathematics with which, by the early 1960's, everyone was 
getting bored. Lakatos did not know enough about specific problems in the field 
to revitalise it with his insights. He is arguing that there is a strong ‘quasi- 
empirical’ element in mathematics. The paper really needed an arena different 
from set theory and Foundations, although one sees his point. ‘Foundations’ is 
even by its name the home of the justificationist philosophies to which Lakatos 
is so hostile, and it was important for him to find in that locale a place for 
‘quasi-empirical’ considerations. ‘Proofs and Refutations’ is a better source of 
examples, but he was by then worried by the stock reaction, that there is 
something ‘special’ about Euler’s conjecture, so that we need pay it no heed 
when we think about any other branch of mathematics. But this foray into 
enemy heartland does not have his usual panache. 

Be that as it may, the word ‘quasi-empirical’ is used to indicate the interplay 
of generalisation and example which, Lakatos claims, is an essential part of 
mathematical activity. There is something empirical, at least this: the pro- 
duction of instructive instances. But the instances are not literally experiments. 
A picture of a star polyhedron might even press the point against Euler’s 
conjecture better than an actual star-polyhedron, whereas we do not think 
experimental evidence works like that. Yet the more Lakatos came to doubt 
the observation-theory distinction on the side of the physical sciences, the more 
tempting it was to compare natural science to mathematical activity. This is 
not to say the comparison is simple, for what makes, e.g. propositions of 
elementary arithmetic analogous to ‘basic statements’? What distinguishes the 
examples of polyhedra that are vital counterexamples to an originally conjectured 
proof? The groped-for answer, I think, is ‘only methodology’. In particular, the 
particular kind of progress, which in ‘Proofs and Refutations’ was still ‘heuristic’, 
and which later provides the same kind of canon of objectivity as we find in the 
physical sciences. 


2.5 ALIENATION AND THE THIRD WORLD 


A first (but not necessarily important) question in the philosophy of mathematics 
is whether mathematical truth is a human construction or an extra-human 
reality. This is the fundamental break between Platonism and nominalism, and 
characterises a good many other ‘isms’ too. Perhaps the question depends on a 
mistaken dualism between subjective minds on the one hand and, on the other, 
things of which minds can have knowledge. One way to escape this dualism in 
the natural and mathematical sciences alike is to try to do something with 
Popper’s idea of a ‘third world’. Lakatos says little about this, but references 
do appear more frequent as time goes on, and the idea is always cited in favour- 
able terms. It is already foreshadowed in a curious Hegelian panegyric on 
page 146 of Proofs and Refutations: 
Mathematical activity is human activity. Certain aspects of this activity—as 
of any human activity—can be studied by psychology, others by history. 
Heuristic is not primarily interested in these aspects. But mathematical 
activity produces mathematics. Mathematics, this product of human 
activity, ‘alienates itself’ from the human activity which has been producing 
it. It becomes a living growing organism that acquires a certain autonomy 
from the activity which has produced it... 


We have here the seeds of what later became Lakatos’s redefinition of ‘internal 
history’, the doctrine underlying his ‘rational reconstructions’ and also his 
attraction to Popper’s ‘third world’. One of the lessons of Proofs and Refutations 
is that mathematics might be both the product of human activity and autono- 
mous, with its own internal characterisation of objectivity which can be analysed 
in terms of how mathematical knowledge has grown. 

Popper’s metaphor of a ‘third world’ may be puzzling but the basic idea is 
straightforward even for those who lack an Hegelian background. It is a variation 
of emergentism, an unjustly discredited doctrine of the nineteenth century. In 
Lakatos’s definition, ‘the “first world” is the physical world; the “second world” 
is the world of consciousness, of mental states and, in particular, of beliefs; the 
“third world” is the Platonic world of objective spirit, the world of ideas’ 
(volume 2, p. 108). I prefer those texts of Popper’s where he says that the third 
world is a world of books and journals stored in libraries, of diagrams, tables 
and computer memories. To introduce Platonic spirit is massively confusing, 
for the third world has little to do with Plato nor with Platonism; indeed the 
third world is better described in nominalistic terms of actual uttered sentences 
organised into theories, problems and the like. 

Stated as a list of three worlds we still have a mystery that makes some 
readers start discussing ‘ontology’. But stated as a sequence of three emerging 
kinds of entity with corresponding laws it is less baffling. First there was the 
physical world. When sentient and reflective beings emerged out of that physical 
world then there was also a second world whose descriptions could not be in 
any general way reduced to physical world descriptions. Although neither 
philosopher will enjoy the comparison, Davidson’s theory of mental events and 
Popper’s first and second world seem to me to ride very close to each other. 
Every mental event is the occurrence of physical events, but, type of event by 
type of event, there is no reduction of descriptions of one to descriptions of 
the other. 

Popper’s third world is more conjectural. His idea is that there is a domain 
of human knowledge which is subject to its own descriptions and laws and which 
cannot be reduced to second-world events (type by type) any more than second- 
world events can be reduced to first world ones. Lakatos persists in the meta- 
phorical expression of this idea: ‘The products of human knowledge; propositions, 
theories, systems of theories, problems, problemshifts, research programmes live 
and grow in the “third world”; the producers of knowledge live in the first and 
second worlds’ (volume 2, p. 108).1 One need not be so metaphorical. It is a 


1 This is part of an account of ‘demarcationism’, which includes ‘conventionalism’ as a 
special case. In a curious footnote to my quotation Lakatos writes: ‘Most demarcationists 
agree that propositions are true if they correspond with facts, and thus subscribe to a 
correspondence theory of truth. (Some conventionalists may prefer the coherence 
theory).’ I take it that introduction of a third autonomous world at least makes possible 
that neither of these theories of truth is acceptable. At any rate Lakatos does not say 
that he subscribes to either of these common demarcationist assumptions, and I take 
his neutral stand-offish reporting to betray that he has a quite different goal with respect 
to truth. 
difficult but straightforward question whether there is an extensive and coherent 
body of description of ‘alienated’ and autonomous human knowledge that 
cannot be reduced to histories and psychologies of subjective beliefs. A sub- 
stantiated version of a ‘third world’ theory can provide just the domain for the 
content of mathematics. It admits that mathematics is a product of the human 
mind, and yet is also autonomous of anything peculiar to psychology. An 
extension of this theme is provided by Lakatos’s conception of ‘unpsychological’ 
history: ‘History of science and its rational reconstructions’ in volume 1. 


2.6 INTERNAL HISTORY 


Lakatos begins with an ‘unorthodox, new demarcation between “internal” and 
“external” history’ (p. 102), but it is not very clear what is going on. External 
history commonly deals in economic, social and technological factors that are 
not directly involved in the content of a science, but which are deemed to 
influence or explain some events in the history of knowledge. External history 
may include changes in the school system, the advent of Sputnik, or dadaism and 
the course of the Weimar Republic. Internal history is usually the history of 
ideas germane to the science and attends to the motivations of research workers, 
their patterns of communication and lines of intellectual filiation. The distinction 
is not very clear: standard internalists regard prosopography as the nadir of 
external history yet it is arguably only a sophisticated version of traditional 
enquiries into lines of filiation. But roughly speaking the distinction is clear 
enough. We have a spectrum ranging from Truesdell’s severely internal Archive 
for the History of the Exact Sciences to, for example, D. de Solla Price’s use of 
Polya urn models and citation counts to describe the spread of knowledge as 
an epidemic. The former appears to examine only the content of the science, 
while the latter seems to have nothing to do with it. 

Lakatos’s internal history is to be one extreme on this spectrum. It is to 
exclude anything in the subjective or personal domain. What people believed 
is irrelevant: it is to be a history of some sort of abstraction from what is said. 
It is, in short, to be third world history, the history of Hegelian alienated 
knowledge, the history of anonymous and autonomous research programmes. 
That poses a double question: whether there is some stable domain of laws 
about the third world which is a necessary condition for believing that there is 
a third world and secondly, whether such ‘normative reconstructions’ can 
properly be called history at all. 

These questions are of different magnitudes, and only one of them can be 
answered now. We shall have to wait and see whether talk of a third world 
turns out to be legitimate. At present it is only an ingenious suggestion; it 
will be a long time before we have before us enough irreducible truth about the 
growth of knowledge to justify this bit of emergentism. (There is of course 
nothing especially Popperian about the third world: Althusser and Quine are 
alike part of the act.) 

As for the other question, whether normative reconstructions are histories, 
the answer is a cautious ‘yes’. But they are only applied history: the past applied 
to the solution of a philosophical problem. History of science has to welcome 
the ecumenical moment and let a hundred histories bloom. There is no reason 
to accept Lakatos’s own maxim, that history of science without philosophy of 
science is blind. At worst, to quote Kant rather than misparaphrase him, it is 
one-eyed.1 There is plenty of history to which no present philosophy is relevant. 
An example is the history of experimental work, a field neglected not only by 
Lakatos but also by historians at large, few of whom have any sense of what an 
experimenter is, nor could tell a good experiment from a bad one. No known 
philosophy will remove the experimental blinders that affect our present 
generation of historians of theorising. Lakatos has no right to exclude various 
kinds of history, either by slanging it as ‘mob psychology’, ‘inductivism’ or what 
not, nor by his more common practice of sheer omission. The many histories 
teach us various things. The best of historians, when they do have philosophies, 
seem to have learned them from no philosopher. But by the same token that 
makes us reject Lakatos’s dismissal of much history, we have to welcome his 
own use of the past. 

Unlike most writing of history Lakatos’s historiography has rules that are 
irritatingly simple to the trained historian. I shall describe them in my next 
section but first a remark on his idea of ‘internalism’. Internal history is a 
history of theory enunciated in sentences. Those sentences are comprised not 
only by the final research report, but also the tentative working out, the scribbles 
on Maxwell’s postcards, the notes in the journal of Lavoisier. The sentences 
include promulgations of what to do and why to do it. They include reactions 
to failure, confessions of reversal, crowings with success, although how these 
last are to be filed as ‘internal’ or ‘external’ is obscure. No matter how the 
selection procedure works, internal history remains the history of sentences and 
not (except figuratively) of thoughts or ideas. The good internal historian will 
not be the one who plucks a pretty idea from his cranium and smudges it down 
on the archives, claiming that was really what was going on. He will be the 
reader who can sieve out the decisive sentences in terms of which to construct 
generalisations that predict the occurrence of the rest of the sentences that 
comprise the internal history. Of course no one has ever succeeded in stating 
the right generalisations, but Lakatos did have some apparatus for getting on 
with the job. That is what ‘hard core’, ‘heuristic’, ‘monster-barring’, and the 
like are up to. He was also a master of the pointed quotation. Sometimes he 
abused this gift for polemical purposes but that is our payment for his extra- 
ordinary ability to single out sentences that make sense of the rest. As long as 
internal history of the kind urged by Lakatos remains a craft, the first condition 
of being an artisan is to be able to quote to precise effect. Thus this well known 
feature of Lakatos’s work, apparent from the first in ‘Proofs and Refutations’, is 
not an adventitious feature of his style, but a part of its nature. 


2.7 RATIONAL RECONSTRUCTION 


Lakatos has a problem, to characterise the growth of knowledge internally by 
analysing examples of growth. There is a conjecture, that the unit of growth is 
the research programme (defined by hard core, protective belt, heuristic) and 
that research programmes are progressive or degenerating and, finally, that 


1 ‘Mere polyhistory is a cyclopean erudition that lacks one eye, the eye of philosophy.’ 
Immanuel Kant in his Logic, quoted from the translation by R. Hartmann and W. 
Schwartz, Bobbs-Merrill: Indianapolis and New York, 1974, p. 50. 
knowledge grows by the triumph of progressive programmes over degenerating 
ones. To test this supposition we select an example which must prima facie 
illustrate something that scientists have found out. Hence the example should 
be currently admired by scientists or people who think about the appropriate 
branch of knowledge, not because we kow-tow to orthodoxy but because 
workers in a given domain tend to have a better sense of what matters than 
laymen. Having chosen an example we should read all the texts we can lay 
hands on, covering a complete epoch spanned by the research programme, and 
the entire array of practitioners. 

Within what we read we must select the class of sentences that express what 
the workers of the day were trying to find out, and how they were trying to 
find it out. Discard what people felt about it, the moments of creative hype, 
even their motivation or their role models; discard not only sociopolitics but 
also prosopography and Polanyi’s ‘tacit’ world of presuppositions and sensibility 
that is supposed to underlie the sombre content of the science. Having settled 
on such an ‘internal’ part of the data we can now attempt to organise the result 
into a story of Lakatosian research programmes. 

As in most enquiries an immediate fit between conjecture and articulated 
data is not to be expected. Three kinds of revision may improve the mesh 
between conjecture and selected data. First we may fiddle with the data analysis, 
secondly we may revise the conjecture, and thirdly we may conclude that our 
chosen case study does not, after all, exemplify the growth of knowledge. I shall 
discuss these three kinds of revision in order. 

By improving the analysis of the data I do not mean lying. Lakatos made a 
couple of silly remarks in his ‘falsification’ paper, where he asserts something 
as historical fact in the text, but retracts it in the footnotes, urging that we take 
his text with tons of salt. The historical reader is properly irritated by having 
his nose tweaked in this way. No point was being served. Lakatos’s little joke 
was not made in the course of a rational reconstruction despite the fact that 
he says it was. He was constructing some examples that he wanted to make 
look sharp. He used Prout’s hypothesis (that atomic weights of elements are 
integral multiples of that of hydrogen) to illustrate the case of a research 
programme wallowing, but staying afloat, in a sea of anomalies. Prout was a 
medical man and amateur chemist who discovered HCl in the stomach, did 
much useful work on biological chemicals, and did some hack publicising in a 
Bridgewater Treatise. Lakatos made Prout into a significant figure who knew 
that chlorine has a weight of 35.5 but still promulgated his hypothesis of 
integers. A footnote corrects this by saying that Prout thought Cl was 36. In 
fact, Prout had so fudged the numbers that he got 36 and believed it (an 
interesting case in itself, for the fudging is so manifest in Prout’s brief paper). 
Lakatos’s point would have been perfectly well served by the facts rather than 
his fiction, for many able analytical chemists, especially in Britain, did persist 
in Prout’s hypothesis after it was ‘known’ that Cl had to be about 35.5. It was 
unnecessary for Lakatos to spruce up the example by distorting the facts; my 
point is, however, that he was merely improving on an example, and not engaging 
in a rational reconstruction of the sort used to test his conjecture about research 
programmes. 

When Lakatos’s conjecture and the selected data do not fit, one should, just 
as in any other enquiry, first try to reanalyse the data. As I say, that does not 
mean lying. It may mean simply reconsidering or selecting and arranging the 
facts, or it may be a case of imposing a new research programme on the known 
historical facts. The strongest example of the latter case is furnished by the 
paper, ‘Cauchy and the Continuum’ that I shall discuss in Section 2.8 below. 
From the very beginning of his work on mathematics Lakatos had been puzzled 
by Cauchy’s account of convergence and its subsequent evolution, but he could 
make no sense of it until he proposed that Cauchy was following a quite different 
line of research than has commonly been attributed to him. That reanalysis 
of the data, whether it be right or wrong, is at least provocative and well 
exemplifies the possibility of making new sense of old chestnuts. 

If the data and the Lakatosian conjecture cannot be reconciled, two options 
remain. First, the case history may itself be regarded as something other than 
the growth of knowledge. Such a gambit could easily become monster-barring, 
but that is where the constraint of external history enters. He can always say 
that a particular incident in the history of science fails to fit his model because 
it is ‘irrational’, but he imposes on himself the demand that one should allow 
this only if one can say what the irrational or external causal element is. External 
elements may be political pressure, corrupted mores or, perhaps, sheer stupidity. 
Lakatos’s histories are normative in that he can conclude that a given chunk of 
research ‘ought not to have’ gone the way it did, and that it went that way 
through the interference of external factors not germane to the programme. In 
concluding that a chosen case was not ‘rational’ it is permissible to go against 
current scientific wisdom. But although in principle Lakatos can countenance 
this, he is properly moved by respect for the implicit appraisals of working 
scientists. I cannot see Lakatos willingly conceding that Einstein, Bohr, Lavoisier 
or even Copernicus was participating in an irrational programme. ‘Too much 
of the actual history of science’ would then become ‘irrational’ (volume 1, p. 172). 
We have no standards to appeal to, in Lakatos’s programme, other than the 
history of knowledge as it stands. To declare it to be globally irrational is to 
abandon rationality. 

In the paper on Copernicus Lakatos describes a revision in his methodology 
due to Elie Zahar. As first stated a progressive programme is one whose 
theoretical content keeps in advance of its empirical content; that is, it is good 
at making novel predictions. Now we read that the facts need not be strictly 
speaking novel. They may have been known before. They count as ‘novel’ 
if they were not considered as part of the original inspiration of the programme, 
but are surprisingly delivered for free as the investigation proceeds—a role 
claimed, by Lakatos, for the Balmer formula in Bohr’s programme. 

Lakatos accepted this revision in order to give a rational account of why 
Copernicus’s programme superseded Ptolemy’s. It is a considerable softening of 
Lakatos’s doctrine. In the strict sense of the words it is almost always possible 
to tell when a phenomenon is ‘newly discovered’. But in the new relativised 
form we are led dangerously near to second-world considerations of whether 
Bohr or whoever was thinking of the Balmer formula in the back of his head, as 
he was working on other details. Facts are now counted as novel if research 
workers had not thought of including them in their domain beforehand. But 
although one can sometimes show that so and so did have something in the 
back of his mind by consulting his notebooks and postcards, it is not in general 
the sort of thing that one can settle nor, in my opinion, is it the sort of thing 
that Lakatos ought to concern himself with. In short I think a hard line Lakatos 
that can’t explain the Copernican revolution is better than a soft Lakatos that 
can ‘explain’ too much. Perhaps other possible revisions would lead to less 
softening. One of these is the idea that research programmes have hard cores. 
The picture of hard cores and protective belts is familiar enough (without 
these pornographic metaphors) in a wide range of philosophical writings, most 
notably those of Quine. But it is a picture that detailed investigation does not 
substantiate very well, and I expect that if Lakatos’s programme continues to 
be deployed such tales of concentric spheres will be as fully abandoned in 
epistemology as once happened in astronomy. 


2.8 WHIG HISTORY: THE CASE OF CAUCHY 


Whig history, to use the magisterial phrase that Herbert Butterfield made a 
term of opprobrium, is the practice of reading the past in terms of the present 
to which it led. It thinks of events as important then, chiefly if they are part of 
a chain that leads to what we now value. Lakatos’s enquiries are Whig history 
with a vengeance, for he will never single out a case study to test his methodology 
unless it is a piece of science that current wisdom deems to be progress. The 
past is rigidly interpreted in terms of what happened later. I do not object to 
this because Lakatos’s work is applied history. We choose incidents that 
exemplify the growth of knowledge and help test the philosophical conjectures. 
Yet this history is Whiggish in an extreme way, for it can happen that only the 
most up-to-date of discoveries will make sense of a seeming counterexample to 
Lakatos’s conjectures. ‘Cauchy and the Continuum’ illustrates this. 

Lakatos had long been attracted to Cauchy’s definition of convergence. It 
seemed a nice case: here is Cauchy, propagandist for a new Euclidean rigour 
in the calculus, formulating definitions that even at the time were known ‘not 
to apply to all examples’. The concept of convergence was put right at earliest 
by the ε, δ approach of Bolzano and Weierstrass, yet Cauchy lost no reputation 
by the counterexamples. Lakatos wanted to do a story on this not unlike what he 
wrote for Euler, and there is a chapter about this in his thesis, published in the 
1976 book Proofs and Refutations. Even one of his pet analytical phrases, ‘finding 
the hidden lemma’, is taken from a paper by Seidel that is correcting Cauchy. 
Yet in detail the story could never be made to fit, and Lakatos regularly sought 
for something to make sense of it. 

The solution turned up inadvertently from Abraham Robinson’s non-standard 
analysis that gives a precise concept answering to a Leibnizian idea of infinitesi- 
mals. Lakatos jumped at it: there must have been two competing research 


1 Abraham Robinson [1966]: ‘... Leibniz’ attitude towards infinitely small and infinitely 
large quantities in the Calculus remained basically unchanged during the last two 
decades of his life. He approved entirely of their introduction, but thought of them as 
ideal elements, rather like the imaginary numbers. These ideal elements are governed 
by the same laws as ordinary numbers...’ (p. 261). Non-standard analysis shows how 
to make sense of these infinitesimals. For a hint at the idea, note that familiar axioms 
for arithmetic are satisfied in a domain of ‘unnatural numbers’, objects that go on 
after integers, with n*, n*’, n*” ... occurring later than any integer. The common 
programmes. One of them culminated in Weierstrass and the ε, δ foundations 
taught in College Calculus. It has no truck with infinitesimals. But there was 
also a Leibnizian programme of which Cauchy, says Lakatos, was a proponent. 
Cauchy’s definition of convergence is supposed to be fine for those parts of 
the calculus for which the newly refined doctrine of infinitesimals applies. For 
reasons indicated in the footnotes below I find the claims of this paper ingenious 
but pretty implausible. Certainly they are insufficiently argued, a fact which 
must have inclined Lakatos to withhold it from publication. Lakatos had taken 
his early research and added Robinson’s new idea without properly reworking 
and rethinking the whole thing, so I read this strictly as work in progress. 


2.9 SPEEDY PHILOSOPHISING 


The difficulty in seeing what Lakatos is doing should not blind us to the fact 
that often, when we know just what he is doing, he is going much too fast. 
Consider pages 14-16, volume 1. This provides the whole of his refutation of 
‘dogmatic falsificationism’, the view that most of us have attributed to Popper, 
and is best summarised by the tag of Braithwaite’s quoted on page 13, ‘Man 
proposes, nature disposes’. Lakatos says this attitude rests on two false assump- 
tions. First, that there is a psychological borderline between speculative 
propositions and observational ones, and, secondly, that observational 
propositions can be proved by (looking at) the facts. For the past fifteen years 
these assumptions have been jeered at, but we ought also to have argument. 
Lakatos’s ‘arguments’ are dismayingly facile and ineffective. He says that a 
‘few characteristic examples already undermine the first assumption’. In fact 
he gives one example, of Galileo using a telescope to see sun-spots, a seeing 
which cannot be purely observational. That is supposed to refute, or even 
undermine, the theory-observation distinction? 


definition of division may be consistently applied so that 1/n*, 1/n*’ and so forth are 
different ‘numbers’ which are less than any assignable rational fraction. Robinson was 
able to derive a remarkable account of the Calculus that made use of such infinitesimals 
in a way reminiscent of some of the suggestions of Leibniz and his successors, and 
which is superficially quite different from the foundations developed in the middle 
of the nineteenth century. 

1 Robinson quotes several passages from Cauchy, and writes: ‘Whatever the precise 
picture of an infinitely small quantity that may have been in Cauchy’s mind, we may 
examine his subsequent definitions and see what they amount to if we interpret the 
infinitely small and infinitely large quantities mentioned in them in the sense of Non- 
standard Analysis.’ He thinks this interpretation fits rather well; then ‘we proceed to 
consider a famous error of Cauchy’s, which has been discussed repeatedly in the 
literature.’ The ensuing discussion by Robinson is Lakatos’s starting point. But 
Robinson asserts only a possibility about Cauchy, while Lakatos turns this into a 
positive assertion of historical fact. (Ibid., pp. 270-1). Note some differences between 
what Robinson does and Lakatos’s use of Robinson. In a statement about the convergence 
of a series of functions f_n(x), Robinson takes x to range over the standard 
real numbers while Lakatos takes it to range over the full extended real number system 
including infinitesimals. For this and other reservations, see footnote 5 to Feferman’s 
paper (op. cit. n. 2, p. 385 above). One is also tempted to think that external history is 
germane to the history of Cauchy and his theorem. ‘His character,’ to quote one recent 
author, ‘was flawed by bigotry and an extremely strong desire to display his intellectual 
superiority over others.’ One fears that this may have more to do with the reaction to 
Abel than any submerged Leibnizian program of infinitesimals. 
As for the second point, that one can look and see whether observation 
sentences are true, Lakatos writes in italics, ‘no factual proposition can ever 
be proved from an experiment . . . one cannot prove statements from experience 
... This is one of the basic points of elementary logic, but one which is 
understood by relatively few people even today’ (volume 1, p. 16). Such an 
equivocation on the verb ‘prove’ is particularly disheartening from a writer who 
has done a good deal to remind us of the several senses of the verb, that the 
verb properly bears the sense of ‘test’ (the proof of the pudding is in the eating, 
galley proofs) and that such tests often lead to establishing facts (the pudding 
is stodgy, the galleys full of misprints). 

That is the totality of refutation of dogmatic falsification, except for an 
afterthought that ‘exactly the most admired scientific theories simply fail to 
forbid any observable state of affairs’ (p. 16). In support of this we get not fact 
but ‘an imaginary case of planetary misbehaviour’. This makes the Duhemian 
point that one can commonly patch up a theory by adding auxiliary hypotheses; 
when one of the hypotheses pans out, that is a triumph for the theory, while if 
it does not, we just go on trying to get more auxiliaries. Thus, it is claimed, the 
theory does not forbid anything, for we get an inconsistency with observation 
only through intervening hypotheses. This too is ill-argued, and illustrates 
another kind of sloppiness. From the historical fact that hypotheses have 
sometimes been saved it is inferred that hypotheses can always be saved. Once 
again this is argued by an imaginary version of real life. In the case of Prout’s 
hypotheses (Section 2.7 above) about atomic weights, one can go on insisting 
that chlorine has been imperfectly purified, and the real stuff has weight 36, 
although actual samples come out at 35.5. Lakatos gives us an imaginary statement, 
‘If seventeen chemical purifying procedures p_1, p_2 ... p_17 are applied 
to a gas, what remains will be pure chlorine’. Presented schematically we at 
once see that we can reject this, demanding that p_18 be applied. But in real life 
it does not work like that. Worried that British (integral) atomic weights were 
at odds with continental ones, various committees were set up, and Edward 
Turner was commissioned to get to the heart of the matter. He regularly 
obtained 35.5, and for a while he was criticised, e.g. Prout suggested that the 
silver chloride might be carrying some water with it. A method was found to 
eliminate that possibility. It soon became clear to the community of British 
scientists that chlorine had an atomic weight of about 35.5. More sophisticated 
laboratories in Paris, still intrigued by the possibility that hydrogen is the 
building block of the universe, and shocked by having found that the old 
determinations for carbon are wrong, tried it all over again. But after much 
labour there was no possibility that chemical chlorine had an atomic weight of 36. 
There was no way to save the hypothesis by hoping for better chemical purification, 
and that was that. As it turned out, the hypothesis was on the verge of the 
truth, but that required a quite different research programme, and the idea of 
physical separation of the elements. 

Lakatos had a marvellous ability to get an iconoclastic grip on past science. 
It is probably a good rule of thumb, in reading him, to disbelieve him as soon as 
he moves from real examples to fairy stories. He tells fancies just when he is 
losing his grip. He does not think it worth arguing that the theory-observation 
distinction must be abandoned, or that theories can always be backed up by 
auxiliary hypotheses to save them from the facts. So he recites fictions which are 
supposed to lead us away from what we naively know to be true. Because there 
has been so much speedy propaganda to the contrary, let me assert that of course 
there is a rough and ready distinction between theory and observation, and 
of course we often look and see what is true. Of course some theories are just 
false and, after diligent attempts at patching, have to be abandoned. Sometimes 
this is because one is not bright enough to think of saving auxiliary hypotheses, 
but more often it is because there are none. Moreover sometimes programmes 
just die, not because there is some rival in the field, but because they are wrong. 
There is no theory saying that atomic weights are just more or less arbitrary 
numbers, many but not all of them easily rounded off to multiples of hydrogen. 
That was just a fact. It is still a fact, although we have a deeper theory about 
atomic weight, arising not from chemistry but physics, which makes us look 
back with a little indulgence on the ingenious guess made by a lightweight 
chemist a century and a half ago. 


2.10 CATACLYSMS IN REASONING 


Peirce defined truth as what is reached by an ideal end to scientific enquiry and 
thought that it is the task of methodology to characterise the principles of 
enquiry. There is an obvious problem: what if enquiry should not converge 
on anything? Peirce, who was as familiar in his day with talk of scientific 
revolutions as we are in ours, was determined that ‘cataclysms’ in knowledge 
(as he called them) have not occurred. Theories have had their ups and downs, 
and some have been replaced by others, but this is all part of the self-correcting 
character of enquiry. Lakatos has exactly the same attitude as Peirce. He was 
determined to refute the doctrine that he attributed to Kuhn, that knowledge 
changes by irrational ‘conversions’ from one paradigm to another. 

I do not think that a correct reading of Kuhn gives quite the apocalyptic air 
of cultural relativism that Lakatos found there.1 A good many people now write 
as if Kuhn and Lakatos were telling parallel versions of a similar story, and this 
eclectic attitude may be welcomed. But there is a really deep worry underlying 
Lakatos’s antipathy to Kuhn’s work, and it must not be glossed over. It is 
connected with one of Feyerabend’s aperçus, that Lakatos’s accounts of scientific 
rationality at best fit the major achievements ‘of the last couple of hundred years’. 

A body of knowledge may break with the past in two distinguishable ways. 
By now we are all familiar with the possibility that new theories may completely 
replace the conceptual organisation of their predecessors. Lakatos’s story of 
progressive and degenerating programmes is a good stab at deciding when such 
replacements are ‘rational’. But all of Lakatos’s reasoning takes for granted 
what we may call the hypothetico-deductive model of reasoning. A much more 
radical break in knowledge occurs when an entirely new style of reasoning 
surfaces. The force of Feyerabend’s gibe about ‘the last couple of hundred years’ 
is that Lakatos’s analysis is relevant not to timeless knowledge and timeless 
reason, but to a particular kind of knowledge produced by a particular style 
of reasoning. That knowledge and that style have specific beginnings. So the 
Peircian fear of cataclysm becomes: might there not be further styles of 


1 See my review of Kuhn’s collection of essays, The Essential Tension, forthcoming in 
History and Theory. 
reasoning which will produce yet a new kind of knowledge? Is not Lakatos’s 
surrogate for truth a local and recent phenomenon? 

I am stating a worry, not an argument. Feyerabend makes sensational but 
implausible claims about different modes of reasoning and even seeing in the 
archaic past. In a more pedestrian way my own Emergence of Probability contends 
that part of our present conception of inductive evidence came into being only 
at the end of the Renaissance. A. C. Crombie, from whom I take the word 
‘style’, writes of six distinguishable styles (one of which is the statistical method).1 
Now it does not follow that the emergence of a new style is a cataclysm. Indeed 
we may add style to style, with a cumulative body of conceptual tools. These 
are matters which are only recently broached, and are utterly ill-understood. 

But they should make us chary of an account of reality and truth itself which 
starts from the growth of knowledge when the kind of growth described turns 
out to concern chiefly a particular knowledge achieved by a particular style of 
reasoning. 

To make the matter worse, I suspect that a style of reasoning may determine 
the very nature of the knowledge that it produces. The postulational method of 
the Greeks gave a geometry which long served as the philosopher’s model of 
knowledge. Lakatos inveighs against that domination of the Euclidean mode. 
What future Lakatos will inveigh against the domination of the hypothetico- 
deductive mode and the theory of research programmes to which it has given 
birth? One of the most specific features of this mode is the postulation of 
theoretical entities which occur in high-level laws, and yet which have experi- 
mental consequences. This feature of successful science becomes endemic only 
at the end of the eighteenth century. Is it even possible that the questions of 
objectivity, asked for our times by Kant, are precisely the questions posed by 
this new knowledge? If so, then it is entirely fitting that Lakatos should try to 
answer those questions in terms of the knowledge of the past two centuries. 
But it would be wrong to suppose that we can get from this specific kind of 
growth to a theory of truth and reality. To take seriously the title of Lakatos’s 
proposed book, ‘the changing logic of scientific discovery’ is to take seriously 
the possibility that Lakatos has, like the Greeks, made the eternal verities 
depend on a mere episode in the history of human knowledge. 


IAN HACKING 
Stanford University 
A PHYSICALIST ACCOUNT OF PSYCHOLOGY* 


The term ‘Physicalism’ goes back at least to Carnap and Neurath, in the days 
of the Vienna circle. At first it was introduced in the context of epistemology 
and of the theory of meaning. The thesis at first was that all meaningful 
predicates, including those of psychology, are definable in terms of predicates 
applicable to observable physical objects, but Carnap had to weaken ‘definable’ 
to ‘reducible’, and indeed the hypothetico-deductive method enables us to 
introduce theoretical concepts in an even looser way than that of Carnap’s 
reduction sentences. Besides this semantic thesis, Carnap also held a ‘second 
thesis of physicalism’.1 This was that all laws of nature, including those of 
biology, and sociology, are consequences of physical laws. I think it would be 
useful to add ‘and generalisations of natural history’: one cannot deduce 
biochemically some property of a blood cell, for example, without also some 
description of the structure of the blood cell, even though it is always open to 
us contingently to identify the elements of this configuration with purely 
physically-defined entities. This second thesis of physicalism takes us into the 
realm of metaphysics: we no longer have a merely semantic or epistemological 
thesis. Indeed, in the 1930s, Carnap had (as had Reichenbach)2 the essential 
idea of the identity theory of mind and brain.3 Carnap said that the sentence 
‘Mr A is now excited’ is about the internal state of the body of Mr A. This 
internal state is a physical one, but ‘is characterised only in terms of possible 
effects, namely, those which may be taken as symptoms for the state’. However, 
Carnap’s anti-metaphysical attitude led him to deny that this identity thesis 
was a factual one, and he admitted that a dualistic language can be constructed 
without coming into conflict with the laws of logic or with empirically known 
facts.4 In a sense I could agree with this, since theories are underdetermined by 
the facts, and dualism need not conflict with observable facts, though in the 
light of these facts it can become totally implausible because of considerations 
of simplicity and ontological economy. (Carnap would presumably have wanted 
to replace my ‘implausible’ with some such expression as ‘inconvenient’.) 
Abstracting from Carnap’s positivism, we can therefore see his ‘second thesis 
of physicalism’ as a metaphysical thesis. This comes near to the way in which 
I myself (among others) have tended to use the word ‘physicalism’. I came to 
use ‘physicalism’ as a convenient extension of ‘materialism’. Nineteenth century 
materialists were physicalists, but they had a very limited ontology for physics. 
Like them, I wanted to say that there is nothing over and above the entities 
of physics and nothing which does not behave solely in accordance with physical 
laws. However, not all the entities postulated by physics need be material 
particles. For example, we may need to postulate space-time points, and Quine 
has given good reasons for holding that such things as numbers and sets (which 
are not spatio-temporal at all) are hypothetical entities of physics just as 
electrons or photons are. So the universe may contain more than mere matter. 
* Review of K. V. Wilkes [1978]: Physicalism. London and Henley: Routledge and Kegan 
Paul. £4.75. Pp. 142. 
1 See for example Carnap [1963], p. 883. 
2 H. Reichenbach [1938], § 26. 
3 See Carnap [1963], p. 53, Carnap [1932] and Carnap [1959]. 
4 Carnap [1963], p. 886. 
I say that my ontology is just that of physics. There is indeed a possible worry 
that there will be revolutions in physics, just as there have been in the past. 
So what is the ontology of physics? Is it that of present day physics or is it that 
of some future physics, and if so, which one? At least for the purposes of 
philosophy of biology and philosophy of psychology, we can tie physics to 
present day physics. Any revolutionary changes in physics are not likely to 
affect our understanding of ordinary bulk matter or even of molecules.2 Whatever 
happens about quarks or even more recondite entities, water will still be H2O 
and neurons will still be understandable in terms of the workings of proteins, 
nucleic acids, and so on. If, as seems likely, the brain is essentially a neuronal 
network, future revolutions in physics will be irrelevant to the mind-body 
problem. 

For me then, physicalism is an ontological thesis, and it implies a monistic 
solution to the mind-body problem. K. V. Wilkes, in her Physicalism, does not 
use ‘physicalism’ in quite my way. She distinguishes the issue of monism from 
that of physicalism. In my usage of ‘physicalism’, of course, a person could be 
a monist without being a physicalist, for example if he were a phenomenalist 
or an idealist, but in Wilkes’s usage, a person could also be a physicalist 
without being a monist. (Monist in the context of the mind-body problem. 
By ‘monist’ she clearly does not mean someone who thinks, like Parmenides, 
that the universe contains no diversity.) She says that the ontological question 
of ‘what mental states, events, and processes precisely are’ is ‘the issue of 
monism’.3 She distinguishes this from a second and ‘scientific’ question which 
is the question ‘no matter what mental phenomena may prove to be, do we need 
to make ineliminable reference to them when explaining behaviour?’4 This 
distinction between science and ontology is a little bit reminiscent of Carnap’s 
distinction between external and internal questions, so that for Carnap his 
monistic account of the relation between mind and body is a choice of the most 
convenient language for science. However, I think that Wilkes is nearer to me 
than to Carnap: even if she wants to open up a gap between a scientific thesis 
(physicalism) and an ontological thesis (monism), she nevertheless seems to me 
to treat ontological issues as factual ones. Moreover, her distinction between 
physicalism and (physicalist) monism becomes less clear later in her book when 
she says, ‘Certainly the physicalist wants to argue for a monism of some kind’.5 
The monistic theory she supports is Eliminative Materialism. 

According to Wilkes’s account, parallelism (or epiphenomenalism) could be 
a physicalist theory but not a monistic one. The epiphenomena would not 
figure in the explanations of scientific psychology. However, as she recognises, 
there would still be ‘the amazing brute fact of a mental-physical coincidence 
that would be inexplicable in physical terms’.7 Now I hold that considerations 
of simplicity (and hence of Occam’s razor) are part of the inductive logic of 
science itself. Not only does science depend on the Humean rule ‘Expect the 
future to be like the past’, but it also requires the rule ‘Expect nature to be 
simple, consistently with allowing for the empirical facts’. I do not know how 
to justify the second rule, but then no one has succeeded in justifying the first 


1 See Smart [1978]. 2 See Feinberg [1966]. 
3 Wilkes [1978], pp. 9-10. 4 Ibid., p. 10. 5 Ibid., p. 101. 
6 See her remark about parallelism on p. 101 of her book. 7 Ibid., p. 101. 
(Humean) rule either, and I do not think that Popper has succeeded in his 
interesting attempt to sidestep Hume’s problem by arguing that science is a matter of 
refutation, not of confirmation.1 Thus even if the few and debatable empirical 
tests between Newtonian gravitational theory (as modified by special relativity) 
and Einstein’s general relativity had not been possible at all, I think that 
scientists might have been justified in believing Einstein’s general theory on 
grounds of theoretical economy—for example because of the theory’s ability 
to explain the identity of inertial and gravitational mass. 

Wilkes denies that the identity theory of mind and brain does enable us to 
achieve any economy, as compared with epiphenomenalism. She says that we would 
have ‘precisely as many laws of identity between the mental and the physical as 
there were formerly laws of correlation’.2 (No doubt this is one reason, among 
others, why she prefers eliminative materialism to the identity theory.) But her 
reasoning here seems to be mistaken. General statements of identity should not be 
regarded as laws. As Robert L. Causey has remarked,3 a law is something for 
which we might sensibly ask for a further explanation. (Of course there will be 
some laws which are in fact not explicable by means of still deeper laws.) But 
it does not seem sensible to ask for an explanation of a statement of identity. 
It is, of course, sensible to ask for our reasons for asserting the identity, but 
as Causey has emphasised, not all giving of reasons is explanation. This is obvious 
in the case of particular identity statements. We do not need to explain the 
identity of the morning star with the evening star, though we do have to give 
reasons for asserting the identity. In the case of general statements, it is true 
that ‘Any F is identical with some G’ implies ‘(x) (Fx ≡ Gx)’ which is in the 
form of a law statement and which may need explanation. Nevertheless, this 
law statement may be one of our reasons for asserting the general identity 
statement, and the identity statement does not say anything, over and above 
our reasons for asserting it, that requires explanation. Contrary to Wilkes, and 
Jaegwon Kim, to whom she refers,4 I hold, therefore, that identity statements 
do achieve theoretical economy as compared with statements of correlation. 

As was noted earlier, Wilkes’s route to monism is not the identity theory 
but eliminative materialism. We need not now be able to replace (say) talk 
of pains by talk of processes in the brain, but the logical possibility that we 
should one day be able to do so is enough, she thinks. ‘So long as this is agreed to 
be logically possible,’ she says, ‘he [the eliminative materialist] is entitled to 
say that pains just are brain processes, much as a physicist may say that tables 
are nothing over and above clouds of molecules.’ But this looks more like the 
identity theory after all. 

At any rate, if Wilkes could accept the view that general identity statements 
are not laws, she would have less reason for preferring eliminative materialism 
to the identity theory. However, another reason may come from her views about 
the anomalousness of the mental as described in ordinary non-scientific language. 
Here she is at one with Donald Davidson. Wilkes says that ordinary mentalistic 
talk is that in which we try to explain particular actions, for example, ‘why 
Flossie flounced out of a party only ten minutes after she arrived’.6 Such a 
common sense explanation will typically refer to a person’s wants and beliefs. 


1 See, for example, Popper [1963], ch. 1. 2 Wilkes [1978], p. 12. 
3 Causey [1977]. 4 Kim [1966]. 5 Wilkes [1978], p. 103. 6 Ibid., p. 41. 
People are open-ended systems, and any attempt to state laws of their behaviour 
will be vitiated by the need for ceteris paribus clauses which cannot be cashed 
more specifically. She quotes Davidson: ‘Too much happens to affect the mental 
that is not itself a systematic part of the mental.’1 

There is no non-arbitrary way of identifying beliefs and desires, and so 
Wilkes holds that there is no ‘mental ontology’.2 (Recall Quine’s ‘No entity 
without identity’.3) According to Wilkes, scientific psychology does not suffer 
from this defect, and this is because it does not try to explain particular pieces 
of behaviour, but breaks man down into sub-systems, such as ‘man as problem 
solver’, which can be broken down still further, to ‘man as pattern recogniser’, 
‘man as chess player’, etc.4 Consider experiments on pattern recognition or on 
memorising nonsense syllables. Any reliable generalisations about human 
behaviour are to be sought in scientific psychology of this sort and not in 
common sense psychology with its reliance on the intensional concepts of the 
propositional attitudes. Thus the correlations that (according to Wilkes’s version 
of physicalism) must in principle be deducible from a physicalist basis are not 
couched in common sense psychological terms. 

According to Wilkes, scientific psychology employs concepts of a functional 
sort, and such concepts are compatible with physicalism. Psychology is 
functionalist in the way in which engineering is. Consider a mechanism for 
pattern recognition (psychology) or a washing machine (mechanical engineering). 
We can have a schematic flow diagram in terms of functional units, and then the 
reduction to physicalism arises when we identify the functional units with 
physical mechanisms. This is not to say, of course, that a given function need 
always be performed by the same sort of physical mechanism. As Wilkes points 
out, a timing device might be mechanical in one brand of machine and electronic 
in another. However in any particular case a functional unit can be identified 
with a physical unit. Such physical units need not be spatially or anatomically 
distinct. Thus there may be wires and components common to two functionally 
distinct electronic circuits. Normally a functional explanation will be quite 
complicated and will have to specify a hierarchy of functional units. 

It is worth pointing out here that there is an intermediate position between 
the assertion that there are universal (inter-specific) correlations between 
psychological functions and their physical embodiments, and the assertion that 
there are no general correlations between functions and physical embodiments 
at all. As James Hopkins has remarked, there could be more restricted 
identities.5 Moreover, U. T. Place has remarked6 that ordinary dispositional 
concepts, as when we ascribe a belief to someone, have a sort of universality, 
even though they are restricted to a particular temporal segment of a particular 
person, and so there is no objection to identifying a particular belief (say) with 
a (possibly rather transitory) state of a person’s brain. Even here, therefore, 
it does not follow that identity statements about function and embodiment of 
function need be of the token-token sort: even here, type-type identities are 
still possible. Or at least, I would say, if we can neglect Davidsonian arguments 
about the indeterminacy of belief and desire: at least the argument from 


1 Davidson [1970], p. 99. 2 Wilkes [1978], p. 42. 
3 Quine [1969], p. 23. 4 Wilkes [1978], p. 57. 
5 Hopkins [1978], p. 227. 6 Place [1978]. 
functionalism does not show that there are not type-type identities, though their 
scope may, of course, be more or less circumscribed. 

Wilkes herself is concerned with scientific psychology, which she thinks can 
avoid the intensionality of our ordinary common sense psychological discourse. 
However, with what I suspect may be some inconsistency, she holds that 
scientific psychology is ‘continuous’ with common sense psychology. Can the 
extensional be continuous with the intensional? She contrasts psychology with 
biochemistry: there is no rudimentary biochemistry which is inescapable in our 
daily social existence.1 The relation between scientific psychology and common 
sense psychology is, she says, no more puzzling than ‘the relation between 
ordinary language talk of chairs and tables and the physicists’ talk of molecules, 
protons, and neutrons’. She finds this relation difficult but suggests that an 
Austinian analysis of such locutions as the ‘really’ in ‘Tables are really only 
collections of molecules’ might help. I am sceptical of the value of such 
Austinian analyses here: I think that in ontology, we should concentrate not 
on words like ‘really’ but (as Quine has urged) on the existential quantifier. 

Wilkes contrasts the term ‘information’ as used in scientific psychology with 
the same term as used in ordinary language, where it is intensional, just as 
‘belief’ is. She says rather puzzlingly that in psychology ‘“Information” is an 
intensional term ... but it is treated purely extensionally’.2 Perhaps it would 
be better to say that in scientific psychology ‘information’ is not intensional at 
all but that it is purely extensional just as it is in the information theory of the 
communications engineer. Again, she has a metaphor of inner and outer circles 
of psychology, with intensionality riding only on the outer circles.3 If she is 
talking of scientific psychology, I think that her previous arguments show that 
intensionality is absent everywhere. 

However that may be, I can agree with Wilkes that intensionality is no barrier 
to physicalism, whether the word ‘physicalism’ is used in her way or in mine. 
Wilkes goes on to discuss some other objections to physicalism. Sometimes it is 
said that robots or computers can model certain aspects of human behaviour 
(for example, chess playing) but they cannot themselves be said literally to 
engage in this behaviour (for example, to themselves actually play chess). They 
are ‘programmed’ to do these tasks, and it is objected that to do something 
because it is programmed to do it is not really to do it. An adding machine, it 
is said, does not really add. In discussing this objection, Wilkes distinguishes 
AI (artificial intelligence) devices and S (simulation) devices. An AI chess 
player will not be expected to play chess in the way we do. A human chess player 
may be said to enjoy chess, to play chess anxiously, and so on, but there is 
no reason why we should be able to say this of an AI computer. What about 
an S computer? Wilkes has already given reasons for saying that the ambitions 
of scientific psychology are limited. She has ‘denied at some length that 
physicalism should seek to explain our use of ordinary-language mental terms’.4 
What it seeks to do is to explain the most fundamental and pervasive human 
capacities, and these are explained in technical vocabulary. These fundamental 
capacities do themselves help to explain our more complex activities (which 
are described in ordinary language), but Wilkes has argued that these complex 
activities are not the primary objects of scientific analysis. 


1 Wilkes [1978], p. 47. 2 Ibid., p. 54. 3 Ibid., p. 66. 4 Ibid., p. 72. 
Wilkes then goes on to discuss the objection that computers could not be 
sentient. She points out that feeling, enjoying, suffering, and so on are not 
things that we do, and so a fortiori are not things that an S computer could be 
programmed to do. Nevertheless, an S computer could perhaps be ascribed 
non-programmable capacities.1 We could also imagine a minimally programmed 
robot in a society of similar robots, so that it could learn and develop as a 
human baby does. However, to construct such an S robot, we would have to 
know already the constitution and workings of the humans it simulated. The 
fantasy does not therefore help with the mind-body problem. Wilkes says that 
‘we may [might?] as well have continued to argue for the logical possibility 
of physicalism’.2 She points out that the useful scientific function of computer 
simulation is to simulate the exercise of particular capacities of humans, and 
the invention of a suitable simulation can lend weight to functional hypotheses 
put forward by psychologists. Thus, the computer model serves as a theoretical 
bridge between psychology and neurophysiology. 

Wilkes investigates the question of how far down in the hierarchy of functions 
an S robot can simulate human performance. Thus a washing machine as a 
whole simulates (in a general sort of way) a human who washes clothes, but 
there is nothing in the human which carries out the subordinate function 
performed by a spin dryer. Furthermore, a simulation may be judged good or 
bad not only by functional criteria, but also by the criterion of structural 
isomorphism.3 The computer may have discrete structural components which 
correspond to discrete anatomical structures in the brain. Wilkes points out 
that demands for structural isomorphism cannot be carried to an extreme: the 
computer would have to have the same sort of cellular structure as the human 
brain, and at an even deeper level the cells themselves would have to have the 
same biochemical structures as living cells. In which case we might as well have 
studied the simulated neurophysiological structures themselves. Wilkes suggests 
that at the intermediate level at which computer simulation is worth while, 
it is a poor tool for studying sentience (as opposed to sapience). 

How then can the physicalist account for sentience? What would a robot have 
to be like to be sentient? Wilkes can find no unitary description of sentience. 
After rejecting incorrigibility, privileged access, and immediacy as criteria for 
sentience, she makes a preliminary sorting of states of sentience into two broad 
classes: (1) ‘intransitive’ sentient states, bodily sensations such as pains and 
tingles; (2) ‘transitive’ sentient states such as the seeing of a sunset or the being 
aware of the position of one’s leg. After a rather complex discussion, which 
I am not sure that I have quite understood, she assimilates the intransitive 
states to the transitive ones: pain is the perception of damage.4 Despite what 
she says, I still want to go the other way. She puts down the inclination to go 
the other way to a foundationalist epistemology, but there seem to me to be 
more respectable motives than this. The fact of consciousness can get lost if 
we too easily assimilate consciousness to perception. (Though it may come down 
to something of the sort in the end—a sort of reflex perception of one’s inner 
processes.) If perception is thought of as the being caused to believe by means 
of the senses,5 then it would seem that one can perceive without being conscious 


1 Ibid., p. 74. 2 Ibid., p. 77. 3 Ibid., p. 81. 4 Ibid., p. 103. 
5 Cf. Armstrong [1968], Pitcher [1971]. 
at all: consciousness at least seems to involve a higher order perception of one’s 
inner (cerebral) processes, not just a coming to believe about the external world 
or the non-cerebral parts of one’s body. 

Even so, there is still something puzzling about consciousness, and therefore 
I think that it is, in a way, somewhat to Wilkes’s credit that in the end she seems 
a bit indecisive as to how to deal with consciousness. At one place she embraces 
eliminative materialism, and at another she treats consciousness as an 
explanatory concept.2 She allows3 that many will feel dissatisfied with her treatment 
of consciousness. She refers to her final chapter in which she argues that the 
Greeks (and, in particular, Aristotle) seemed to manage very well without any 
notion equivalent to that of the philosopher’s notion of consciousness (or of 
‘sensation’). At one point, she quotes from W. I. Matson, who, of course, has 
also argued for a similar conclusion. She makes interesting comparisons 
between Aristotle’s hierarchical theory of psuche and her functionalism. It is a 
category mistake, she says,5 to identify the psychological and the physiological, 
because it is a mistake to identify form and matter, or a function and a particular 
sort of material performer of the function. In reply, I would say that psycho- 
logical entities, such as beliefs and desires, are correctly said to cause actions, 
and so surely they cannot be mere abstract forms or functions. The attractions 
of an identity theory are stronger than Wilkes allows here, and the distinction 
between form and matter seems even less helpful when one wants to give a 
philosophically satisfying account of sensations, such as toothaches. But as 
Wilkes says, Aristotle was primarily concerned with sapience, not with sentience. 

Aristotle was thus not bothered by the questions which brought us to mind- 
body dualism or its materialistic rival, the identity theory. Unlike Ryle, whose 
philosophy was quite Aristotelian in spirit, Aristotle had no previous history 
of Cartesianism to get out of his system. (Though there were earlier dualisms. 
What about Plato’s Phaedo?) That a great philosopher ignores or partly ignores 
a problem can be explained in other ways than by saying that there is really 
no problem there at all. Wilkes connects Cartesianism with the reification of 
ideas, which comes from scepticism and a foundationalist epistemology. It seems 
to me that even if one accepts (as one should) a holistic epistemology without 
hard foundations, there are still problems: the facts of sentience are not as easily 
got over as Wilkes suggests, and to avoid Cartesianism, one must supplement 
her theory of functionalism by an explicit willingness to assert the identity 
of mental and physiological events, states, and processes. Nevertheless, overall, 
I am much more in agreement with Wilkes than in disagreement, and I think 
that her book is one for which the philosophical public should be very grateful.6 


J. J. C. SMART 


The Australian National University 


1 Wilkes [1978], p. 103. 

2 Ibid., p. 106. 3 Ibid., p. 108. 

4 See Matson [1966] and [1976]. 5 Wilkes [1978], p. 121. 

6 I might as well mention a few minor points which do not affect her argument. 

On p. 10 of her book she refers to a strict Leibnitzian identity in contrast with ‘a looser 
theoretical identity’. I think that there is only one clear concept of identity (the 
Leibnitzian one). On p. 90, line 12, ‘to’ should be ‘from’. On p. 34 a phenomenon of 
surface tension is incongruously listed with gravitational phenomena. On p. 102 we 
have the C-fibre myth: see Puccetti [1977]. 
Truth and Meaning (TM) contains thirteen essays most of which are related to 
the work of Donald Davidson. A rough division can be made between essays 
which try to elucidate or criticise Davidson’s programme, and ones which try to 
contribute to it. I shall concentrate on the former group, although mention must 
be made of the ones by Woods and Wiggins, belonging to the latter group, as 
both thoughtful and interesting. It will be convenient to begin with a sketch 
of Davidson’s position. 

Davidson has claimed that there is an important connection between knowledge 
of a theory of truth for a language L and an understanding of L. A theory of 
truth in the style of Tarski entails for each sentence s of L a biconditional of the 
form: 


(T) x is true in L if and only if p 


in which ‘x’ is a structural description of s, and ‘p’ its translation into the meta- 
language (assuming that L is not part of the metalanguage). No semantic 
vocabulary is assumed, since the key semantic notion of satisfaction is defined. 
But the theory does assume translatability from the object language to the 
metalanguage, thereby seemingly both guaranteeing and trivialising the claim 
that to know a truth theory for L is to understand L. Davidson, therefore, 
cannot assume translatability; but the price of not doing so is the use of primitive 
semantic vocabulary. Thus a truth theory in the style of Davidson is (a) (see 
Foster, TM, p. 8) a set of axioms which contains semantic terms, and which 
implicitly defines a truth predicate for L by entailing for each s of L an appropri- 
ate T-sentence. In doing so it satisfies what Davidson calls ‘Convention-T’. 
Moreover, (b) whereas Tarski employs an unrelativised truth predicate, for 
Davidson truth is relativised to a speaker S and time t. So that the axioms have 
to entail for each s of L a sentence of the form: 


(T’) (S)(t) x is true in L for S at t if and only if p. 


It seems difficult to say anything in general about the relation of ‘p’ to ‘x’ in 
(T’), though presumably (c) it is determined by the extension of the relativised 
truth predicate. There are, however, other requirements: (d) the axioms of the 
truth theory must be finite (Davidson [1970], p. 178) to ensure that the truth 
theory yields insight into the semantic structure of sentences. Further, (e) the 
statement of the truth conditions for s must draw upon the same concepts as 
the sentences whose truth condition they state (ibid., p. 179 and [1973], p. 82) 
to ensure that the metalanguage does not have a richer ontology than the object 
language—though it will have additional semantic vocabulary. 
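
Requirements (a) and (d) can be illustrated with a minimal sketch, which is not drawn from Davidson or from TM. It assumes a hypothetical toy object language with two names, two predicates, negation and conjunction; the lexicon and the function names are invented for the illustration. It shows how a finite stock of axioms yields a T-sentence for each of unboundedly many sentences. The sketch simply helps itself to translation through the lexicon, so it is Tarskian in spirit and does not model the empirical constraints on radical interpretation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str
    pred: str


@dataclass(frozen=True)
class Not:
    body: "Sentence"


@dataclass(frozen=True)
class And:
    left: "Sentence"
    right: "Sentence"


Sentence = Union[Atom, Not, And]

# Finite base axioms (hypothetical lexicon): object-language word -> metalanguage phrase.
NAMES = {"Schnee": "snow", "Gras": "grass"}
PREDS = {"ist weiss": "is white", "ist gruen": "is green"}


def describe(s: Sentence) -> str:
    """Structural description of an object-language sentence (the 'x' of schema T)."""
    if isinstance(s, Atom):
        return f"{s.name} {s.pred}"
    if isinstance(s, Not):
        return f"nicht ({describe(s.body)})"
    return f"({describe(s.left)} und {describe(s.right)})"


def truth_condition(s: Sentence) -> str:
    """Metalanguage truth condition (the 'p' of schema T), built up recursively."""
    if isinstance(s, Atom):
        return f"{NAMES[s.name]} {PREDS[s.pred]}"
    if isinstance(s, Not):
        return f"it is not the case that {truth_condition(s.body)}"
    return f"({truth_condition(s.left)} and {truth_condition(s.right)})"


def t_sentence(s: Sentence) -> str:
    """The T-sentence that the toy theory entails for s."""
    return f"'{describe(s)}' is true in L if and only if {truth_condition(s)}"


if __name__ == "__main__":
    print(t_sentence(And(Atom("Schnee", "ist weiss"), Not(Atom("Gras", "ist weiss")))))
```

Four lexical entries and two recursive clauses cover every sentence of the toy language, however complex. This is the sense in which a finite axiomatisation is supposed to yield insight into semantic structure.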
Two further conditions have to be added to ensure, in the homophonic case, 
that the right hand side of a T-sentence is a translation of its left hand side.1 
One (f) is that the theory must be ‘supported or verified by evidence plausibly 
available to an interpreter’ (Davidson [1974], p. 316). This evidence must be 
statable ‘without essential use of such linguistic concepts as meaning, interpreta- 
tion, synonymy and the like’ (ibid.). Further, it must not contain detailed 
descriptions of speakers’ beliefs, intentions, etc., since belief and meaning are 
interdependent. But we can take as evidence the fact that there are certain 
sentences which speakers hold true. Using such evidence we can construct 
hypotheses about the conditions in which sentences are held true by members 
of a speech community. However, in doing so, we must adopt the methodological 
principle of charity. This (g) involves assigning truth-conditions to sentences 
of L in a way which makes ‘native speakers right as often as is plausibly possible, 
according of course to our own view of what is right’ (ibid., p. 324). 

The connection then between a knowledge of a truth theory for L and an 
understanding of L is ‘that someone is in a position to interpret the utterances 
of speakers of a language L if he has a certain body of knowledge entailed by a 
theory of truth for L—a theory that meets certain formal and empirical con- 
straints—and he knows that this knowledge is entailed by such a theory’ (TM, 
p. 34). These constraints, of course, would be (a)-(g) above, or something similar. 

Many criticisms of this subtle theory are levelled in TM, and papers by Jardine 
and Potts in Formal Semantics of Natural Language (FNSL) also contain 
forceful criticisms. One crucial problem arises because Davidson’s theory 
assumes a grasp of a concept of truth. This cannot, in radical interpretation, be 
the recursively defined notion of truth-in-L which the linguist aims to construct. 
Which notion then is assumed? This question is ably discussed by C. Peacocke 
in his ingenious paper ‘Truth Definitions and Actual Languages’ (TM, p. 162). 
His strategy is, firstly, to say what makes an interpreted language L a language 
of a population P; and, secondly, to demarcate those sentences of L which 
say something about the world. No use, of course, is to be made of semantic 
vocabulary. The proposal then is, roughly, that a sentence s is true in the language 
of P if and only if it belongs to the demarcated class, and the condition associated 
with it in the interpreted language which is the language of P holds. 

Peacocke discusses, only to reject them, various attempts to solve the first 
task. One of these, employing the notion of convention, is similar to a proposal 
of Loar’s (‘Two Theories of Meaning’, TM, p. 138) which is in some ways 
preferable to Peacocke’s, since it takes factors such as illocutionary force into 
account. An idea of the solution Peacocke proposes can be gained from one he 
rejects. This develops the idea that if L is the language of P, members of P can 
understand each other’s utterances of sentences of L. It proposes that an 
interpreted language should have two components. The first, a finitely axiomatised 
theory in English, has an uninterpreted predicate ‘T’, and yields theorems of the 
form ‘T(s) iff p’ in which s is a structural description of a sentence of L, and p 
an English sentence. The second component is a function ‘f’ from sentences 


to propositional attitudes. An interpreted language L is then the language of P 
only if 


1 An excellent account of the need for these conditions and the defects of the theory 
without them is given by Foster; see especially TM, pp. 7-23. 
∀σ ∀p (⊢ (T(σ) ≡ p) & f(σ) = ψ . ⊃ . it’s common knowledge in P that 
∀x ∈ P ∀y ∈ P (x notices that y utters σ . ⊃ . x takes y’s utterance of σ as p.f. 
evidence that y ψ-s that p, and x’s reason for doing so is in part that it’s 
common knowledge in P that every member of P expects that ∀x ∈ P ∀y ∈ P 
(x notices that y utters σ . ⊃ . x takes y’s utterance of σ as p.f. evidence that 
y ψ-s that p))). 


The problem, Peacocke argues, is that the proposal attributes to each member 
of P a propositional attitude to infinitely many sentences, but does not do so 
on the basis of finitely many propositional attitudes to finitely many kinds of 
expressions. Peacocke’s preferred solution has just such a basis; though only 
on the restrictive assumption that L has finitely many atomic sentences. And there 
are other difficulties. Surely it is false that any two members of P can understand 
each of the other’s utterances, because of differences of dialect, in intelligence, 
etc. (cf. Ziff [1960], p. 3). And, for similar reasons, it is false that each member 
of P has a propositional attitude to every sentence of the language of P. Further, 
the account does not allow for the fact that s may be uttered with different 
illocutionary force on different occasions. 

Peacocke’s proposal is, of course, alien to Davidson’s enterprise because it 
attributes fine-grained propositional attitudes to the members of P; Davidson, 
it will be recalled, restricts himself to the ‘coarse’ one of holding true. However, 
Peacocke is surely right that ‘true’ in ‘holds true’ isn’t semantically inert, so 
that some explanation of its use is needed. But I’m not sure that it follows that 
we must give an alternative account of a truth predicate for the native language 
for which the linguist tries, in radical interpretation, to construct a truth 
definition. Presumably, a sentence of the form ‘Native sentence s is held true 
in circumstances c’ would be a sentence of whatever language the linguist was 
using as a meta-language. Hence, this sentence’s truth conditions would be 
given by a truth definition for that language, not by one for the native’s language. 
It would then seem possible for the linguist, when constructing a truth definition 
for the native language, to be guided by the sort of coherence considerations 
involved in the Principle of Charity. The motto could be: correspondence at 
home, but coherence abroad. 

The question of the relation between knowledge of a truth theory for L and 
the ability to interpret it is raised in some form or other by Loar, Potts, 
Strawson, Foster and Dummett. Potts, for example, argues (FSNL, p. 248) 
that model theory is essentially an exercise in translation. He is, of course, right 
that the metalanguage sentences must themselves be understood. But that 
does not make the construction of a truth theory for L an exercise in translation. 
Certainly according to Davidson, one who aims to construct a truth theory 
adequate for interpretation does so in a way that ensures that the truth conditions 
for each sentence s will also serve as a translation. But in no familiar sense of 
‘translate’ is he translating from the object language to the metalanguage—in 
radical translation there is no pre-existing dictionary to appeal to. Anyway, 
Davidson’s claim isn’t that to interpret sentences of L one must have constructed 
a truth theory for it; but rather that one has knowledge of a certain kind which 
one knows is entailed by such a theory. 

One response of Davidson’s at this point is difficult to evaluate. He argues 
that while an interpreter knows that ‘his knowledge consists in what is stated 
by a T-theory, a T-theory that is translational . . . there is no reason to suppose 
that the interpreter can express the knowledge in any specific form, much less 
in any particular language’ (TM, p. 37). But to grant this is hardly to grant that 
there is no language in which he can express what he knows, which is the real 
issue. This issue raises two questions: (i) Could a speaker of English know, for 
example, that (a) ‘it is raining’ is true in English for S at t if and only if it is 
raining near S at t, without being able to say in some language what he knows? 
And, (ii) Could a speaker of English know that (a) follows from a T-theory for 
English which is translational, without being able to say in some language 
what he knows? The answer to (i) is surely ‘yes’; most English speakers cannot 
state the truth conditions of the sentences they use (cf. Dummett, TM, p. 70). 
The answer to (ii) is much less clear. But an affirmative answer to (i) is, I think, 
enough to establish Davidson’s point that knowledge of L need not, anyway, 
involve knowledge of how to translate from one language to another. 

In fact whenever a relativised truth predicate is employed translation can 
hardly be the issue, as Davidson points out (TM, p. 37). Davidson’s analysis of 
action sentences, for instance, attributes truth conditions to them, involving 
quantification over events, which are very different from those suggested by 
their ‘surface’ grammar. This raises the question, ably discussed by Strawson 
(‘On Understanding The Structure of One’s Language’, TM, p. 189), of the 
explanatory value of attempts to explain our understanding of sentence types 
‘by reference to a true or underlying structure which differs more or less 
radically from the superficial grammatical form of the sentences in question’ 
(TM, p. 189). If, following Strawson, we call ordinary action sentences S S 
sentences, and their proposed Davidsonian logical forms D S sentences, then 
one view which Strawson rightly rejects is that our actual ability to understand 
a given S S sentence can be explained in terms of our understanding of the 
corresponding D S sentence to which it is related by a set of transformation 
or translation rules. This would be to explain that with which we are familiar 
and understand, in terms of the unfamiliar and only potentially understood. 
A different view, to which Strawson is more sympathetic, is that an S S sentence 
and its D S analogue are merely notational variants; presumably, their relation 
would be like that of p ⊃ q to Cpq. His objection to this view is that ‘we have 
available an alternative account (the Adverbial) of the elements and modes 
of combination involved, according to which these elements and combinations 
are not at all unperspicuously represented by the formal arrangements as they 
stand’ (TM, p. 197). I am not clear what is the alternative account; and, anyway, 
Davidson could hardly accept the ‘notational variant’ view, since a D S sentence 
radically modifies the grammar of its S S analogue—prepositions, for instance, 
become predicates. So there is a serious question how a given S S sentence 
is related to its D S analogue; a question, incidentally, which the remarks on 
logical form in Evans’s paper (‘Semantic Structure and Logical Form’, TM, p. 
199) bear on in an illuminating way. But Strawson is surely right that a S S 
sentence might be more perspicuous than its D S counterpart. As Kripke says, 
‘Surely no one really thinks that the “true” structure of quantification over 
individuals is quantification over sequences . . .’ (TM, p. 357). 

One issue raised by Strawson’s paper is whether Davidson is trying to describe 
what someone who understands English must know in order to understand it, 
or, more modestly, to describe something which, if one knew it, would enable 
him to understand English? Strawson seems to have no quarrel with the latter, 
more modest, enterprise, but assumes that Davidson’s concern is with the former. 
I’m not clear what the truth is. Certainly, at times, what Davidson says explicitly 
suggests only the modest enterprise (cf. Davidson [1974], p. 313 and TM, p. 34). 
This, indeed, might be the only reasonable one, until we had a clearer idea 
whether there is something which all speakers of English must know in order 
to understand English sentences. 

But even read modestly Davidson’s theory would be vulnerable to the charge 
made, in Dummett’s important paper (‘What is a Theory of Meaning II’, TM, 
p. 67), that Davidson’s theory assumes we could have knowledge of truth 
conditions which it is impossible for us to have. Dummett’s paper is at once an 
attack, and a prolegomena to an alternative theory. 

The attack argues that Davidson’s theory attributes knowledge of truth 
conditions to someone who understands a language which cannot be accounted 
for by any of the possible models of our knowledge of truth conditions. Dummett 
recognises only two such models: the first (A) involves an ability explicitly to 
state the truth conditions of a sentence. The second (B) consists of an ability 
to recognise that the sentence’s truth conditions obtain whenever (sic) they do 
(TM, p. 80). The problem that arises at this point for Davidson’s theory does 
so, Dummett thinks, because a natural language contains numerous sentences 
which are not effectively decidable. Model-B is clearly inadequate in such cases; 
if there is no effective procedure we could not be sure that we could always 
tell when a condition obtained. Moreover, we could account for all such cases 
on model-A only by making what, in Dummett’s view, is the implausible 
assumption that undecidability arises only from the use of verbal explanations 
to expand the vocabulary of a language—though on a Quinean model of language 
this seems to me not all that implausible an assumption. 

An example of an effectively undecidable statement would be (b) ‘John was 
brave’, said of someone who had no chance to exemplify the virtue in his 
lifetime. Dummett recognises three responses to the question of (b)’s truth or 
falsity: (i) it is not necessarily either; (ii) bravery has a physiological correlate 
which John either had or did not have; (iii) though bravery does not have a 
physiological correlate, it is nevertheless something which a person either has, 
or does not have. Position (iii) precludes reducing the truth of (b) to that of any 
other set of statements; so that if true, it is what Dummett calls a ‘bare truth’. 
But if our model of what knowledge of bare truth consists in is that of an 
observation statement, and this in turn is model-B, then position (iii) is plainly 
unsatisfactory. So if (i), which seems to be Dummett’s preferred position, were 
the only alternative to (iii), we would have an argument for adopting (i). However, 
plainly (i) is not the only alternative unless (ii) can be disposed of; and, as far 
as I can see, the realism implicit in (ii) is acceptable to Dummett (TM, p. 95). 
So it is unclear what the argument is at this point. Anyway, Dummett’s 
confidence that there are only two models of our knowledge of truth conditions 
is one that can hardly be shared. If, as he acknowledges, one cannot associate a 
specific practical ability with the grasp of each sentence, the question whether 
someone knows the truth conditions of a given sentence s may well depend in 
part on his grasp of at least a fragment of the language which contains s. So 
someone’s grasp of some of the logical connections between (b) and various 
other sentences of English might well constitute evidence that he knew the 
truth conditions of (b). Finally, Dummett’s argument is very sensitive to the 
view taken of the nature of recognitional abilities. If, for instance, one argued, 
reasonably enough, that such an ability only involves recognising a given 
characteristic more often than not when it manifests itself in favourable circum- 
stances, it would be difficult to rule out position (iii), even assuming that model-B 
is the only relevant one. 

Dummett’s alternative prolegomena takes as its model the intuitionist’s 
explanation of the logical constants. To understand a statement p, it is not 
necessary that one should be able to decide whether p is true (or false); but it is 
necessary that if it has been established that p is true (or false), one can tell that 
this has been established. Connectedly, to assert p is not to assert that its truth 
has been established, but only that it has, or could be. The difficulty with this 
proposal seems to be that either it is implausibly restrictive, or else vulnerable 
to precisely the sort of attack which Dummett mounts against Davidson. To 
see this, consider the problems of a language learner if to assert p is to assert 
that it has been verified, or could be. Suppose p to have been asserted for the 
latter reason. If what is claimed is that there exists a certain procedure which, 
if it were followed, would establish p’s truth (or falsity), then the claim is 
surely very restrictive—particularly if to understand p one has to understand 
the procedure—though subsequent practice could show what the procedure is. 
If, on the other hand, the claim is merely that such a procedure might be found, 
then there is no longer any clear connection between future practice and the 
procedure, and hence, between future practice and meaning. What the language 
learner needs to know are the statement’s assertability conditions, and he is 
provided with no clue to these. Assertability conditions have become at least as 
elusive as truth-conditions. 

Kripke’s long paper on substitutional quantification (‘Is There a Problem 
About Substitutional Quantification’, TM, p. 325) raises many important issues. 
Evidently he has been exasperated by the use made of formal results (or alleged 
results) in the philosophy of language. One of his morals is that ‘philosophers 
should have a better sense of both of the power and limitation of formal and 
mathematical techniques’ (TM, p. 413); and it is difficult to think of a better 
way of improving this sense than by reading his paper. 

His discussion illuminates at many points. He repeatedly points out, for 
instance, that ‘Homophonic theories are intelligible only if the object language is 
antecedently understood’ (TM, p. 356), which, of course, is not the case in 
radical translation. He makes a number of remarks which raise in an acute form 
the question how one could be sure that (e) (cf. p. 411) is satisfied, since the 
right-hand side of Davidsonian T-sentences often seem to employ much richer 
conceptual resources than their left-hand sides. A Davidsonian truth-theory 
for demonstratives might, Kripke suggests, yield: ‘“This is bigger than that” is 
true in the mouth of a speaker s at a time t iff the thing he refers to by “this” at t 
is bigger than the thing he refers to by “that” at t’ (TM, p. 349). Perhaps the 
concepts introduced on the right-hand side which are not present on the left 
are all ‘semantical’; but it is difficult to be sure, and hence to be sure that new 
ontology is not being introduced. For instance, must every language which 
contains a sentence translatable by ‘This is bigger than that’ contain a word 
translatable by ‘thing’? 

But the main issue is substitutional quantification. Davidson has maintained 
that a truth-theory employing substitutional quantification cannot satisfy 
Convention-T ([1973], p. 79), whilst Wallace has argued that a substitutional 
interpretation reduces to a referential one, since an adequate truth theory 
employing the substitutional quantifier will contain the conceptual resources to 
yield the referential interpretation (Wallace [1970], p. 131). Kripke shows that 
this is not always so, i.e., that there is no general result, and that a substitutional 
theory can satisfy Convention-T in the sense ‘that to each φ of L a φ′ of M [the 
metalanguage] is associated, where φ′ does not contain the predicate T(x) and 
T(φ) ≡ φ′ is provable’ (TM, p. 346). There are indeed special cases (in which a 
totally defined denotation function is given for all the terms of L, and all of its 
formulae are transparent) in which a result like Wallace’s holds (TM, p. 351). 
One interesting question would perhaps be whether T-theories for sizable 
fragments of natural language could satisfy these constraints. Even so the 
abiding question posed by Kripke’s essay is of the naturalness of the very severe 
constraint imposed by Davidson’s programme. 

Apart from the generally high standard of the essays it contains, TM benefits 
immeasurably from their concentration on Davidsonian themes. What emerges 
is a discussion which cannot fail to deepen one’s understanding of the issues 
raised by Davidson’s theory. One feature of the theory is that it suggests a 
framework for collaborative effort—perhaps the nearest thing there is to a 
philosophical research programme—and this book is essential reading for any 
informed decision about that framework. 

By contrast FSNL contains twenty-five papers somewhat loosely grouped, 
according to their theme, into six sections. There are many interesting papers, 
giving one an invaluable conspectus of the many competing approaches to the 
formal semantics of natural language. I can here do no more than mention some 
of the papers which I found especially rewarding. David Lewis’s ‘Adverbs of 
Quantification’ discusses a group of adverbs which includes ‘always’, ‘sometimes’, 
‘never’, ‘usually’, ‘often’ and ‘seldom’. He rightly rejects the proposals that they 
are, respectively, quantifiers over times and events. He argues convincingly 
that instead they should be treated as quantifiers over cases—though I’m not 
clear how his proposal works for certain sentences containing two such adverbs, 
e.g., ‘Usually if a man is late for work he almost never tells anyone.’ His paper 
should be read in conjunction with Altham’s and Tennant’s impressive paper 
‘Sortal Quantification’ which, in an appendix, argues that Lewis’s adverbial 
quantifier can be represented as an (#, 1) quantifier in their sense. Barbara 
Hall Partee’s tightly argued paper ‘Deletion and Variable Binding’ is concerned 
with pronominalisation and deletion in so-called ‘Super-Equi NP deletion’ 
cases. Her solution convincingly posits two sorts of pronoun, bound variables, 
and pronouns of laziness. She argues that Super-Equi NP deletion should be 
defined on variables and is obligatory, and that certain sentences which might 
appear to be optional surface forms arise by different processes involving 
pronouns of laziness. A most impressive paper is J. A. W. Kamp’s ‘Two Theories 
About Adjectives’. He rejects the view that an adjective is a function mapping the 
meaning of a noun phrase onto that of another, on the ground, which seems right, 
that the theory cannot provide an account of comparatives and superlatives. 
His preferred solution takes the problem of vagueness seriously. Very roughly, 
the idea is that we say that x is at least as intelligent as y not only if, given our 
present criteria, we should say that x is intelligent if y is, but if this remains true 
on all sharpenings of those criteria. It is then a straightforward business to define 
‘x is more intelligent than y’. However, a doubt arises about the assumption 
Kamp has to make ‘that all individual cases of vagueness can be resolved, though 
not all at once’ (FSNL, p. 142). The contrary case is argued for in a fairly 
convincing way in Crispin Wright’s paper ‘Language Mastery and the Sorites 
Paradox’ in TM; and while Kamp may be right, there is certainly room for 
argument on this point. 
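
Kamp’s comparative, as glossed above, can be set out schematically. What follows is only a sketch: the notation is introduced here for illustration and is not Kamp’s. It assumes that C stands for our present criteria, that S(C) is the set of admissible sharpenings of C (taken to include C itself), and that I_c(x) abbreviates ‘x counts as intelligent under criteria c’. The second clause is one natural way of completing the definition, not necessarily Kamp’s own formulation.

```latex
% Hypothetical notation: C = present criteria, S(C) = sharpenings of C,
% I_c(x) = "x counts as intelligent under criteria c".
\[
x \succeq y \iff \forall c \in S(C)\,\bigl(I_c(y) \rightarrow I_c(x)\bigr)
\]
\[
x \succ y \iff (x \succeq y) \wedge \neg\,(y \succeq x)
\]
```

On this reading ‘x is more intelligent than y’ holds when some sharpening counts x intelligent and y not, while no sharpening does the reverse.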

DAVID HOLDCROFT 

University of Warwick 
Of the thirty-six articles in these volumes, twenty are in English, the rest in 
German. Twenty-nine are new. I shall restrict myself to discussion of a few of 
the new essays. I shall, however, draw attention to some other recent work on 
Frege which bears upon the concerns of these essays. 

Very little of the current literature on Frege concerns the historical background 
to his work, but the essays by Sluga and Dudman in volume 1 contain some 
very valuable historical insights. In ‘Frege as a Rationalist’ (vol. 1) Sluga 
explodes the idea that Frege’s philosophical opponents were the German 
idealists, and shows instead that the target of his antipsychologism was the 
positivistic, science-dominated philosophy popular in Germany during the 
mid nineteenth century. He also argues that Frege’s early thought was influenced 
by the linguist Trendelenburg, who stated in outline the programme of the 
Begriffsschrift. An influence on Frege which has long been suspected is the 
teachings of Lotze; no direct evidence of Frege’s debt to Lotze is available, but 
Sluga shows how strong the parallels between the two are. He also claims that 
interpretations of Frege which emphasise his platonism are historically mis- 
leading (p. 29); like Lotze, Frege was concerned to establish only the objectivity 
of abstract objects, and denied that they had any reality (Wirklichkeit) (p. 37, 
see also Sluga [1977], p. 232). I have two comments to make. First, in denying 
Wirklichkeit to numbers, courses-of-values, etc., Frege is not thereby denying 
that they exist. Wirklichkeit for Frege is a metaphysical category logically 
distinct from existence (see e.g. Frege [1893], p. xxv). What guarantees the 
existence of these objects in Frege’s system is the notorious basic law 5, of 
which he was always slightly suspicious even before the discovery that it led 
to paradox. Thus the question of the existence of these objects was one about 
which Frege was much exercised. Secondly, Sluga overlooks the fact that, at 
least in his later writings, Frege does ascribe Wirklichkeit to some abstract 
objects (Gedanken). (See my [1980] and compare Dummett [1976b].) 

In ‘Einige Einseitigkeiten des Fregeschen Logikbegriffs’ (vol. 2) Gottfried 
Gabriel also draws attention to some similarities between Frege and Lotze 
concerning the logical form of judgments, especially the way in which general 
propositions are to be seen as constructed out of particular ones (pp. 81-2). 
This constitutes an interesting historical addition to Dudman’s analysis of 
the ways in which Frege’s logic constitutes progress over that of Boole (‘From 
Boole to Frege’, vol. 1). For, as Dudman notes, one of the insights which enabled 
Frege to solve problems which Boole could not was the recognition that general 
statements ought to be formulated so as to enable the generality to be deleted 
and the remaining part treated truth functionally. Boole’s logic consisted of 
two quite distinct parts; his propositional logic and the logic of categoricals. 
Dudman shows how Boole’s representation of both kinds of constructions by 
equations gives only a superficial unity to the two logics, for when Boole 
manipulates the propositional part he employs polynomials rather than equations. 
Worse, Boole’s system is inadequate to capture obvious logical relations between 
the two parts. In this sense Frege’s logic, which gives a uniform treatment 
of both kinds of sentences while at the same time displaying perspicuously the 
relations between them, appears as an achievement comparable to Newton’s 
unification of terrestrial and celestial mechanics. 

Dudman points to Frege’s early recognition of the priority of a logic of 
sentences over that of concepts (vol. 1, p. 134), and this, I believe, is a valuable 
clue to the origins in Frege’s logical investigations of the famous dictum that 
‘words have meaning only in the context of a sentence’. Michael Dummett 
has argued elsewhere (see e.g. his [1973], p. 495) that Frege abandoned the 
principle after the Grundlagen. In his ‘Frege’s Context Principle Revisited’ 
(vol. 3) Michael Resnik argues against Dummett’s interpretation of that principle 
but agrees that Frege abandoned it after the Grundlagen (see p. 46). I find this 
rather surprising given Resnik’s own interpretation of that principle. While 
Dummett interprets it as a principle which tells us the ontological status of 
abstract objects (see his ibid., p. 497), Resnik sees it rather as a ‘methodological 
principle’ which directs us to investigate the nature of numbers by examining 
judgments in which number words occur. As he points out (in rather the same 
way as Sluga) Frege’s interest was not so much in the question ‘How can we 
know that numbers exist?’, but in the question (which I take to be an epistemo- 
logical one) ‘How are numbers given to us?’ —in other words, what sort of 
knowledge do we have of numbers? When Resnik comes to the supposed evidence 
that Frege abandoned the principle he produces a quotation from 1896 in which 
Frege says that for the purposes of deductive inference, it is essential that the 
same expression occur in two different sentences and that it have the same 
reference in each. Hence the expression ‘has reference which is independent 
of the parts of the sentence’. Now Dummett might regard this as evidence that 
Frege abandoned the context principle because he (Dummett) takes the principle 
to say that abstract objects exist only in so far as names of abstract objects 
occur in true sentences. On his reading of the context principle abstract terms 
acquire denotation via their occurrence in true sentences rather than the other 
way around. But Resnik’s methodological interpretation of the principle seems 
to me uncontradicted by any suggestion that words have reference independent 
of their occurrence in sentences. For Resnik’s interpretation can be filled out 
in the following way: the principle tells us only that, if we want to provide 
a correct definition of some informal notion like ‘number’, we should do so 
via a consideration of paradigmatic statements in which number words occur. 
Only in this way can we ensure the material adequacy of our definition; that 
it should capture the important intuitive properties of numbers. It is then a 
further question to decide what is it for an abstract term to have a reference. 
Resnik also takes issue with Sluga who has elsewhere argued that Frege 
continued to adhere to the context principle throughout his later work. Sluga 
quotes a passage from 1919 which Resnik says does not imply the context 
principle but ‘is more a passage on how Frege’s methodology differs from his 
predecessors’ (p. 47). But since Resnik explicitly interprets the principle as a 
methodological one, I find this objection to Sluga again surprising. 

In volume 3 there is a new Postskript to Tugendhat’s ‘The Meaning of 
“Bedeutung” in Frege’ (reprinted here in German), in which the author defends 
his position against the criticisms made by Dummett (see his [1973], pp. 199 ff.). 
Tugendhat’s original suggestion was to interpret the Bedeutung of an expression 
as what he calls its ‘truth-value-potential’; the contribution which it makes to 
the truth values of sentences in which it occurs. (A similar suggestion was made 
at roughly the same time by Sluga, see his [1971], p. 268.) To this Dummett 
replied that Bedeutung, so interpreted, would cease to be anything extralinguistic, 
whereas it is clear that, for names at least, Frege and Tugendhat are at one in 
wanting the Bedeutung of a name to be the object it names (see pp. 56-7 of 
Tugendhat’s original essay). I do not find an answer to this in Tugendhat’s 
reply, though he refers twice to the objection (pp. 66 and 69). Tugendhat does, 
however, point to a weakness in Dummett’s account of Frege’s ‘realism’; 
sometimes Dummett describes this realism as the doctrine that the sentences of 
our language have determinate truth conditions independent of our knowledge 
of them (see e.g. his [1973], p. 466), and he assumes that this doctrine is in- 
consistent with the denial that the Bedeutung of an expression is the external 
object which it names. But as Tugendhat points out (p. 67) realism in this sense 
can demand only that the Bedeutung of an expression is not such that the truth 
value of a sentence containing that expression is determined by our own decision 
or convention. Frege’s realism, thus construed, is consistent with objective 
idealism. (Here I think Tugendhat’s argument lends support to Sluga’s historical 
claims about Frege’s philosophical background in Kantian idealism.) Elsewhere 
Dummett adds to the condition of determinate truth values another condition; 
that sentences be true or false by virtue of their constituent expressions having 
certain extra-linguistic entities as referents (see vol. 1, p. 232). It seems to me 
that only the second of the two conditions constitutes anything which can be 
called Frege’s realism; the first on its own would be better described as Frege’s 
‘objectivism’. Tugendhat counters Dummett’s criticism with one of his own; 
Dummett spends some time criticising Frege’s view that incomplete expressions 
have reference but, according to Tugendhat, Frege did not mean the Bedeutung of 
an incomplete expression to be taken as a ‘quasi-object’ to which the expression 
refers, so Dummett’s criticism is misguided (p. 69). But, as Dudman has remarked 
elsewhere ([1972], pp. 25-6), the idea that there is no entity which is the reference 
of a concept word will not stand exposure to Frege’s texts. Another of 
Dummett’s criticisms was to point out that sentences have the same truth-value- 
potential when they have the same truth value only if we restrict the language 
to extensional contexts. Tugendhat’s reply to this I found obscure (pp. 66-7). 

Dummett’s own contribution to this collection is an interesting essay ‘Frege 
on the Consistency of Mathematical Theories’ (vol. 1) which offers an elucidation 
and defence of Frege’s objection to Hilbert’s programme for consistency proofs 
in mathematics. I have only two criticisms of a very minor nature to make. 
First, it is surely misleading to say (p. 230) that Frege could have known that 
there is another way of proving consistency than by exhibiting a model, i.e. 
by showing that every finite subset of an (infinite) set of axioms has a model. 
True, Frege did not need the compactness theorem to see this, but given that 
there was almost no metamathematical background for him to draw upon at the 
time, it is entirely unsurprising that he missed this point. But perhaps the remark 
on the next page indicates that Dummett’s comment was not meant quite 
seriously. Secondly, on page 233 we find a short characterisation of Frege’s 
theory of sense which, with its claim that knowledge of sense must sometimes 
be such that it is not always linguistically expressible, seems to owe more to 
Dummett’s own investigations into the theory of meaning than to Frege’s 
texts (see Dummett [19762], p. 80). 

Many of the papers have no real historical content, being instead analytical 
investigations of Fregean problems in philosophical logic. Perhaps the most 
important is David Wiggins’s contribution (vol. 2) which faces the problem 
about proper names raised by the attempt to explain informative identities via 
an appeal to differences in sense between co-referring expressions. He gives 
an account of the sense of proper names which avoids tying the sense to that of a 
corresponding description. This has the consequence that co-referring names 
have the same sense but still accounts for the difference between informative 
and non-informative identities involving names. (I take it that this theory is 
similar to one recently proposed by John McDowell ([1977], see especially 
pp. 161-2, and compare with Wiggins ibid., p. 232).) 

As is perhaps usual with collections of this size, not all the articles are of 
equal worth, though a generally high level of scholarship is maintained, 
particularly in so far as a great deal of attention is paid to the Nachlasse. Some 
of the essays do little more than recount Frege’s views, while others compare 
the views of various authors on Fregean themes. But the three volumes together 
contain enough good material to be of considerable interest. Volume three has 
an excellent bibliography of work on Frege. 

One final remark. In ‘The Evolution of Frege’s Logicism’ (vol. 1) Bynum 
suggests that Carnap’s notes of Frege’s lectures are in the U.C.L.A. library 
(p. 284, n. 14). They are actually in the Carnap Archive at the University of 
Pittsburgh, where they are being translated from the German shorthand. I 
understand that interested scholars will be able to examine this material in 
the near future. 

GREGORY CURRIE 


University of Otago 